Abstract We study a mean field approximation of the M/M/∞ queueing system.
The problem we deal with is quite different from standard games of congestion,
as we consider the case in which higher congestion results in smaller costs per
user. This is motivated by a situation in which some TV show is broadcast so
that the same cost is incurred no matter how many users follow the show. Using
a mean-field approximation, we show that this results in multiple equilibria of
threshold type, which we compute explicitly. We further derive the socially
optimal policy and compute the price of anarchy. We then study the game with
partial information and show that by appropriately limiting the queue-state
information obtained by the players we can obtain the same performance as
when all the information is available to the players. We show that the
mean-field approximation becomes tight as the workload increases, so the
results obtained for the mean-field model approximate the discrete one well.
1 Introduction

This paper is devoted to the problem of whether an arrival should queue or
not in an M/M/∞ queue. It is assumed that the cost per customer decreases
with the number of customers.

In a wireless context, the M/M/∞ queue may model the number of calls
in a cell with a large capacity. The assumption that the cost per call decreases 
with the number of calls is typical for a multicast in which the same content 
is broadcast to all mobiles, so that the cost of the transmission can be shared 
among the number of calls present. 

Our first objective in this paper is to study the structure of both individually
and globally optimal policies. Our analysis reveals that there exist
threshold-type policies in which an individual is admitted if the number of
ongoing calls exceeds some threshold (whose value depends on whether globally
or individually optimal policies are considered).

The assumption that the cost decreases with the number of customers
distinguishes our model from the standard congestion control problems, which
assume that the cost increases with the number of customers. The structure of
both globally and individually optimal policies can thus be expected to be
quite different from those of standard congestion control problems, which have
been studied for over half a century, starting with the seminal paper of Pinhas
Naor [10]. Naor considered an M/M/1 queue, in which a controller has to
decide whether arrivals should enter a queue or not. The objective of his paper
was to minimize a weighted difference between the average expected waiting
time of those that enter and the acceptance rate of customers. Naor then
considered the individually optimal policy (which can be viewed as a Nash
equilibrium in a non-cooperative game among the players) and showed that it
is also of a threshold type, with a threshold bigger than that of the
centralized model. His result revealed that arrivals that join the queue under
the individually optimal policy wait longer on average than under the globally
optimal policy. Finally, he showed that there exists some toll such that, if it
is imposed on arrivals for joining the queue, the threshold value of the
individually optimal policy can be made to agree with the socially optimal one.
Since this seminal work of Naor there has been a huge amount of research that
extends the model: more general inter-arrival and service times, more general
networks, other objective functions and other queuing disciplines have all been
considered, see e.g. [17,14,13,7,8,3,6,1,12] and references therein.

The importance of the fact that a threshold policy is optimal is that in order
to control arrivals we only need partial information: in fact, we only need
a signal indicating whether or not the queue length exceeds the threshold
value. The fact that this much simpler information structure is sufficient for
obtaining the same performance as in the full information case motivates us to
study the performance of threshold policies and related optimization issues for
a non-cooperative game in a partial information setting. We first study the full
information setting, where each individual optimizes only its own cost, and
explicitly obtain a plethora of threshold-type symmetric
Nash Equilibrium (NE) strategy profiles. Subsequently, we compare the social
cost under an NE strategy profile with the globally optimal social cost. Then,
we consider the individual optimization problem with partial information, where
we send a green signal if the queue length exceeds the threshold value and a red
signal otherwise, and each individual player selects a strategy in order to
optimize its own cost. We note that with this signaling approach, instead
of providing full state information, users cannot choose any threshold policy
with a parameter different from the signaling threshold, and so in the
individual optimization case one could hope that, by setting the signaling
threshold to the value that minimizes the social cost, one would obtain the
socially optimal performance, i.e. the globally optimal cost would be achieved.
We show that this is not the case here; in fact, we observe that, as in [2],
where a similar approach was proposed for an M/M/1 queuing system, the best
possible signaling policy (in the partial information case) achieves the
same performance as the equilibrium under full information.

We study here a simplified mean field limit of the M/M/∞ queuing system
rather than the actual discrete model since on one hand it is much simpler to
handle and solve than the original discrete problem (we obtain closed-form
formulas for all the equilibria) and on the other hand the approximation becomes
tight as the workload increases. To show that, in this paper we establish the
convergence of the game to its mean field limit under appropriate conditions.

The model and some results from sections 4, 8 and 9 appeared in conference 
proceedings as [16]. 

The organization of the paper is as follows: We introduce the discrete model 
in Section 2 and its mean-field approximation in Section 3. In Section 4 we 
find equilibria of the mean-field model, while in Section 5 its social optimum. 
In Section 6 we study the version of the game with partial information. In 
Section 7 we show that the information can be limited without affecting the 
performance. In Section 8 we establish the convergence of discrete models to 
the mean-field one as the workload increases. We numerically evaluate the
threshold policies in Section 9. Finally, we conclude the paper with some
remarks in Section 10.


2 The Model 

2.1 Discrete Model 

We consider a service facility in which an arriving customer can observe the
length of the queue ($X_t$) upon arrival. We refer to $X_t$ interchangeably as
the system state. The value of service is $\gamma$ and the cost of spending time
in service can be computed as an integral of the cost function $c(\cdot)$ over
the service time, with $c(\cdot)$ a continuous decreasing function of the number
of users in the queue. An arriving customer can either join the queue or leave
without being served. The decision is made upon arrival. The situation is
modeled as an M/M/∞ system with incoming rate $\lambda$ and service rate $\mu$.

A customer $k$ arriving at time $t_k$ chooses whether to enter the queue ($E$)
or not ($N$). It follows that the set of pure actions for any customer is
$V = \{E, N\}$. Since the decision that he makes is based on the length of the
queue, a policy (or a strategy) of any customer will be a mapping$^1$
$\pi_k : S \to \Delta(V)$ (since the set $V$ is only a two-point set, we will
identify $\pi_k$ with a function from $\mathbb{N}$ to $[0, 1]$, describing the
probability it assigns to action $E$), where $S \subset \mathbb{R}$ denotes
the set of possible system states (in the discrete model $S = \mathbb{N}$). In
what follows we will assume that the users limit their policies to the sets of
so-called impulse or threshold policies, defined below.

Definition 1 A policy $\pi_k$ of a user is called an impulse policy if there are
finitely many points $x_1, \ldots, x_n \in S$, with $x_0 := \inf S$,
$x_{n+1} := \sup S$, such that $\pi_k$ is constant on any interval
$(x_j, x_{j+1})$, $j = 0, \ldots, n$.

A subclass of the set of impulse policies with very simple structure are 
threshold policies. 

Definition 2 A policy $\pi_k$ of a user is called a $[\theta, q]$-threshold policy if

$$\pi_k(x) = \begin{cases} 0 & \text{if } x < \theta \\ q & \text{if } x = \theta \\ 1 & \text{if } x > \theta \end{cases} \qquad (1)$$

At time $t$, an incoming client who employs this policy joins the system if the
queue length, $X_t$, is bigger than $\theta$, while if $X_t = \theta$ he does so
with probability $q$. Otherwise he never joins the queue.
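
As a minimal illustration of Definition 2, the following Python sketch builds the entry probability prescribed by a threshold policy. The function name `threshold_policy` and its interface are illustrative assumptions and do not appear in the paper.

```python
def threshold_policy(theta, q):
    """Return pi(x), the probability of entering under a [theta, q]-threshold policy."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")

    def pi(x):
        if x < theta:
            return 0.0   # below the threshold: never enter
        if x == theta:
            return q     # exactly at the threshold: enter with probability q
        return 1.0       # above the threshold: always enter

    return pi


# Hypothetical example values: threshold 3, randomization 0.25 at the threshold.
pi = threshold_policy(3, 0.25)
```

Note that, unlike in classical congestion games, entry is prescribed for large states and refused for small ones.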

The cost of a user $k$ arriving at time $t_k$ is defined as follows:

$$C_k(X_{t_k}) = \int_{t_k}^{t_k+\sigma_k} c(X_t)\,dt - \gamma,$$

where $\sigma_k$ is user $k$'s service time.

For each multi-policy $\pi = (\pi_1, \pi_2, \ldots)$, let $[\pi'_k, \pi_{-k}]$ be
the policy which replaces $\pi_k$ by $\pi'_k$ in $\pi$. Now we are ready to
define the solution we will be looking for:

Definition 3 A policy $\pi_k$ is an optimal response for user $k$ against a
multi-policy $\pi$ if

$$E\left[C_k\left(X_t([\pi_k, \pi_{-k}])\right)\right] \le E\left[C_k\left(X_t([\pi'_k, \pi_{-k}])\right)\right] \qquad (2)$$

for every policy $\pi'_k$ of player $k$.

Definition 4 A multi-policy $\pi^* = (\pi_1, \pi_2, \ldots)$ is a Nash equilibrium
(NE) if the policy of every user $k$ is an optimal response for user $k$ against
$\pi^*$, for every $k$. If inequalities (2) are true up to some
$\varepsilon > 0$, we say that $\pi^*$ is an $\varepsilon$-Nash equilibrium.


$^1$ For any finite set $A$, $\Delta(A)$ denotes the set of all probability measures on $A$.
3 Fluid Model

In what follows we will mostly analyze the fluid approximation, which can be 
viewed as the weak limit of the system (scaled in a proper way) as the arrival 
rate of players goes to infinity (see e.g. [15]). Now, we describe the fluid model. 

The system state (the length of the queue) $X_t \in \mathbb{R}_+$. Consequently,
the policies of the players are defined on $\mathbb{R}_+$. The customers arrive
at the queue according to a fluid process with rate $\lambda$. As each of them
uses some policy $\pi_k$, the real incoming rate at time $t$ is
$\bar\pi(X_t)\lambda$, where $\bar\pi(X_t)$ is the average strategy of the
arriving users. Each of them stays in the queue for an exponentially
distributed time with parameter $\mu$, and so the outflow is according to a
fluid process with rate $\mu X_t$. This can be described by the following ODE:

$$\begin{cases} \dot X_t(\pi) = \bar\pi(X_t)\lambda - \mu X_t(\pi), & \forall t \ge 0 \\ X_0 = x_0 \end{cases} \qquad (3)$$

Since there are infinitely many players in the game now, we encounter
problems with defining the multi-policies. For that reason we assume that in
multi-policy $\pi$ all the players use the same policy $\pi$. If we want to
write that only one player, say player $k$, changes his policy to some
$\pi'_k$, we write that players apply policy $[\pi_{-k}, \pi'_k]$, meaning that
each player uses policy $\pi$ except player $k$. Also note that the game is
symmetric, since each player has the same payoff function and strategy space;
thus, it is very difficult to implement an asymmetric Nash Equilibrium. We
illustrate the inherent complications by considering only two players: if
$(\pi_1^*, \pi_2^*)$ is an NE then, by the symmetric nature of the game,
$(\pi_2^*, \pi_1^*)$ is also an NE. If player 2 knows that player 1 selects
$\pi_1^*$ ($\pi_2^*$, respectively), then the optimal response for player 2 is
to select $\pi_2^*$ ($\pi_1^*$, respectively), but player 2 cannot know the
selection of player 1 because the players do not cooperate. Under a symmetric
NE, all players select the same strategy and thus the above complication is
somewhat alleviated. Moreover, $\bar\pi$ is not always well defined, but in such
a case $\bar\pi(X_t) = \pi(X_t)$. Also, with these assumptions, both the cost
and the equilibrium can be defined as in the discrete model.
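
To make the fluid dynamics concrete, the sketch below integrates the ODE of this section with an explicit Euler scheme. The function name `simulate_fluid`, the step size and all numerical values are illustrative assumptions. Because the right-hand side is discontinuous at a threshold, a fixed-step scheme may chatter around the threshold when the trajectory reaches it; the example is chosen so that this does not happen.

```python
import math


def simulate_fluid(pi_bar, x0, lam, mu, horizon, dt=1e-3):
    """Explicit Euler integration of dX/dt = pi_bar(X) * lam - mu * X from X_0 = x0."""
    steps = int(round(horizon / dt))
    x = x0
    path = [x]
    for _ in range(steps):
        x += dt * (pi_bar(x) * lam - mu * x)
        path.append(x)
    return path


# Hypothetical parameters: lam = 10, mu = 1, every arrival enters above level 3.
lam, mu = 10.0, 1.0
enter_above_3 = lambda x: 1.0 if x > 3.0 else 0.0
path = simulate_fluid(enter_above_3, x0=5.0, lam=lam, mu=mu, horizon=5.0)

# With constant inflow the ODE is linear, so the exact value at the horizon is known.
exact = lam / mu + (5.0 - lam / mu) * math.exp(-mu * 5.0)
```

Starting above the threshold with $\lambda/\mu$ also above it, the state stays in the region where everyone enters and rises towards $\lambda/\mu$.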


4 Equilibria of the Fluid Model 

In this section we characterize the equilibrium points of our game. We begin 
by characterizing the evolution of the system state in case all the users apply 
the same impulse policy. 

Lemma 1 Suppose all the players (except maybe one) apply the same impulse
policy $\pi$. Then if the initial state of the system is $x_0$, $X_t$ is
continuous in $t$ for any $x_0$ and is nondecreasing in $x_0$.

Proof It is clear that for $\bar\pi = \pi$ having finitely many discontinuity
points, the (non-classical) solution to the equation (3) is well-defined a.e.
and continuous in $t$.

Next, suppose that $x_0 < x'_0$ and there exists an $s$ such that$^2$
$X_s[x_0] > X_s[x'_0]$. $X_t$ is continuous in $t$, thus by the intermediate
value property there exists a $t^* < s$ such that
$X_{t^*}[x_0] = X_{t^*}[x'_0]$. But in both cases and at each time all
users apply the same policy $\pi$, depending only on the current state of the
system, thus for any $t > t^*$, $X_t[x_0] = X_t[x'_0]$, which is a
contradiction, as we assumed that $X_s[x_0] > X_s[x'_0]$. □


We have one immediate corollary of the above lemma. 


Corollary 1 The expected cost of a player joining the queue at time $t_k$, when
all the other players apply policy $\pi$,

$$E\left[\int_{t_k}^{t_k+\sigma_k} c(X_t(\pi))\,dt\right] - \gamma \qquad (4)$$

where $\sigma_k \sim \mathrm{Exp}(\mu)$, is decreasing in $X_{t_k}$.$^3$

Note that in the above corollary we have replaced $x_0$ with $X_{t_k}$. This is
justified, as the coefficients of (3) depend on $t$ only through $X_t$.
Corollary 1 has an important consequence, which is stated in the lemma below:

Lemma 2 Any best response to a symmetric impulse multi-strategy $\pi$ is a
threshold strategy. Moreover, the best response is unique up to the value of $q$
(see (1)).


Proof A player $k$ arriving at time $t_k$ has only two pure actions: to enter
the queue ($E$) or not to enter the queue ($N$). When he uses the former, his
cost is

$$E\left[\int_{t_k}^{t_k+\sigma_k} c(X_t(\pi))\,dt\right] - \gamma,$$

with $\sigma_k \sim \mathrm{Exp}(\mu)$, which is by Corollary 1 decreasing in
$X_{t_k}$. On the other hand, when $k$ uses action $N$, his cost is 0. Thus, if
$k$ prefers to use action $E$ for $X_{t_k} = x_1$, he will also prefer it for
$X_{t_k} = x_2 > x_1$. Similarly, if he prefers to use $N$ for $X_{t_k} = x_2$,
he will also prefer it for $X_{t_k} = x_1 < x_2$. Finally, as the cost of using
$E$ is strictly decreasing in $X_{t_k}$, there may only exist one point where
$k$ is indifferent between $E$ and $N$, and so he may choose to randomize.
Moreover, at any other point the best response is uniquely determined. □


An immediate, but very important consequence of Lemma 2 is the follow¬ 
ing: 

Corollary 2 In any symmetric$^4$ equilibrium to our queuing game any player
uses a threshold policy.

$^2$ We shall write $X_t[x_0]$ for the value at $t$ of the solution to (3) when $X_0 = x_0$.

$^3$ The fact that we have strong monotonicity here, even though we had weak
monotonicity in Lemma 1, is a consequence of the continuity of $X_t$, which
implies that a trajectory starting at time $t_k$ in a bigger $X_{t_k}$ stays
above the one starting in a smaller $X'_{t_k}$ on some interval, which affects
the integral in (4).

$^4$ The result can be generalized to the asymmetric case, but it would require
some technical assumptions to make sure $\bar\pi$ is well-defined.



Mean-field Game Approach to Admission Control of an M/M/oc Queue 




Remark 1 Note that the equilibrium specifies the action to take at any state, 
including states that are in practice never reached. If a state x is never visited 
then any variation of the equilibrium at states larger than x will not change 
the performance of any player. Yet since we allow for any initial state, there 
may be customers that will find the system at states that are transient and 
will not be visited again. Therefore specifying the equilibrium in such states is 
considered to be important in game theory. Equilibria that are specified in all 
states including transient ones, are known as perfect equilibria. It can also be 
shown that such equilibria are good approximations of those that we obtain 
in case that there is some sufficiently small constant uncontrolled inflow. This 
follows from [5]. 

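Assuming that all (except maybe one) users apply the same $[\theta, q]$-threshold
strategy, we may write explicitly the evolution of the system state $X_t$:

Lemma 3 Suppose the initial state of the system is $x_0$ and that all the users
(except maybe one) apply the $[\theta, q]$-threshold policy. Then the system
state at time $t$ can be explicitly written as:

(a) If $x_0 > \theta$ and $\theta \le \frac{\lambda}{\mu}$ or $x_0 = \theta < \frac{q\lambda}{\mu}$ then

$$X_t = \frac{\lambda}{\mu} + \left(x_0 - \frac{\lambda}{\mu}\right) e^{-\mu t}.$$

(b) If $x_0 > \theta > \frac{\lambda}{\mu}$ then

$$X_t = \begin{cases} \frac{\lambda}{\mu} + \left(x_0 - \frac{\lambda}{\mu}\right) e^{-\mu t} & \text{if } t \in [0, t_{(x_0,\theta)}] \\ \theta e^{\mu(t_{(x_0,\theta)} - t)} & \text{if } t > t_{(x_0,\theta)}, \end{cases}$$

where $t_{(x_0,\theta)} = \frac{1}{\mu} \log \frac{x_0\mu - \lambda}{\theta\mu - \lambda}$.

(c) If $x_0 = \theta = \frac{q\lambda}{\mu}$ then

$$X_t \equiv \frac{q\lambda}{\mu}.$$

(d) If $x_0 < \theta$ or $x_0 = \theta > \frac{q\lambda}{\mu}$ then

$$X_t = x_0 e^{-\mu t}.$$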

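Proof We know that when $p$ is a constant, the solution of the equation

$$\begin{cases} \dot X_t = p\lambda - \mu X_t, & \forall t \ge 0 \\ X_0 = x_0 \end{cases}$$

is

$$X_t = \frac{p\lambda}{\mu} + \left(x_0 - \frac{p\lambda}{\mu}\right) e^{-\mu t}. \qquad (5)$$

Note that, when $p = 1$, this means that $X_t \to \frac{\lambda}{\mu}$
monotonically when $t \to \infty$. Thus, if $x_0 > \theta$ and
$\frac{\lambda}{\mu} \ge \theta$, $X_t$ never leaves the region where policy
$\pi$ prescribes to use action $E$, and so

$$X_t = \frac{\lambda}{\mu} + \left(x_0 - \frac{\lambda}{\mu}\right) e^{-\mu t}.$$

Similarly, note that for $p = 0$, $X_t$ decreases monotonically to 0, thus when
$x_0 < \theta$, $X_t$ never leaves the region where policy $\pi$ prescribes to
use action $N$, and so (5) reduces to

$$X_t = x_0 e^{-\mu t}.$$

Now suppose that $x_0 > \theta > \frac{\lambda}{\mu}$. Then $X_t$ starts in the
region where $\pi$ prescribes to use action $E$ with probability one, which
implies that its trajectory decreases towards $\frac{\lambda}{\mu}$ until time
$t_{(x_0,\theta)}$, when it reaches the threshold $\theta$. From then on $\pi$
prescribes to use action $N$ with probability 1. It is easy to compute that for
$t \le t_{(x_0,\theta)}$

$$X_t = \frac{\lambda}{\mu} + \left(x_0 - \frac{\lambda}{\mu}\right) e^{-\mu t}.$$

Since by definition $t_{(x_0,\theta)}$ is such that
$X_{t_{(x_0,\theta)}} = \theta$, we easily obtain that
$t_{(x_0,\theta)} = \frac{1}{\mu} \log \frac{x_0\mu - \lambda}{\theta\mu - \lambda}$.
Then, for $t > t_{(x_0,\theta)}$, $X_t$ has to satisfy (5) with $p = 0$ and
$t_0 = t_{(x_0,\theta)}$ instead of 0, which gives

$$X_t = \theta e^{\mu(t_{(x_0,\theta)} - t)}.$$

Finally, when $x_0 = \theta$, $X_t$ satisfies at $t = 0$ (5) with $p = q$. If
$x_0 = \frac{q\lambda}{\mu}$, by (5) $X_t \equiv \frac{q\lambda}{\mu}$. Otherwise,
if $x_0 < \frac{q\lambda}{\mu}$, $X_t$ moves upwards and for $t > 0$ behaves like
when $x_0 > \theta$, while if $x_0 > \frac{q\lambda}{\mu}$, $X_t$ moves downwards
and for $t > 0$ behaves like when $x_0 < \theta$. □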
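
The four cases of Lemma 3 can be collected into a single closed-form routine. The sketch below is illustrative only: the name `fluid_state` is an assumption, the boundary case $\theta = \lambda/\mu$ with $x_0 > \theta$ is treated as case (a), and exact floating-point comparisons stand in for the equalities of the lemma.

```python
import math


def fluid_state(t, x0, theta, q, lam, mu):
    """Closed-form X_t when all users apply a [theta, q]-threshold policy."""
    rho = lam / mu
    # Case (a): the state stays in the region where everyone enters.
    if (x0 > theta and theta <= rho) or (x0 == theta and theta < q * rho):
        return rho + (x0 - rho) * math.exp(-mu * t)
    # Case (b): the state falls to the threshold at time t_hit, then decays to 0.
    if x0 > theta > rho:
        t_hit = math.log((x0 * mu - lam) / (theta * mu - lam)) / mu
        if t <= t_hit:
            return rho + (x0 - rho) * math.exp(-mu * t)
        return theta * math.exp(mu * (t_hit - t))
    # Case (c): the state sits at the threshold forever.
    if x0 == theta == q * rho:
        return x0
    # Case (d): nobody enters and the queue empties exponentially.
    return x0 * math.exp(-mu * t)
```

For instance, with the hypothetical values $\lambda = 10$, $\mu = 1$, $x_0 = 20$ and $\theta = 15$, the hitting time is $\log 2$ and the state equals 15 at that instant.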


Now, to simplify the notation, we will make use of the fact that all the
players use threshold policies. Let us define

$$C_k(x, (\theta_{-k}, q_{-k}))$$

to be the expected service cost for player $k$ if he enters the queue when its
state is $x$ and all the players except $k$ apply a
$[\theta_{-k}, q_{-k}]$-threshold policy. $C_k$ can be written as

$$C_k(x, (\theta_{-k}, q_{-k})) = E_{\sigma_k \sim \mathrm{Exp}(\mu),\, x_0 = x}\left[\int_{t_k}^{t_k+\sigma_k} c(X_t(\pi))\,dt\right] = \int_0^\infty \int_0^\tau c(X_t)\,\mu e^{-\mu\tau}\,dt\,d\tau.$$

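The following lemma gives exact ways to compute $C_k$ in each of the cases of
Lemma 3.

Lemma 4 $C_k$ can be computed using the following formulas:

(a) If $x > \theta_{-k}$ and $\theta_{-k} \le \frac{\lambda}{\mu}$ or $x = \theta_{-k} < \frac{q_{-k}\lambda}{\mu}$ then$^5$

$$C_k(x, (\theta_{-k}, q_{-k})) = \frac{1}{\lambda - x\mu} \int_x^{\lambda/\mu} c(u)\,du.$$

(b) If $x > \theta_{-k} > \frac{\lambda}{\mu}$ then

$$C_k(x, (\theta_{-k}, q_{-k})) = \frac{1}{x\mu - \lambda} \int_{\theta_{-k}}^{x} c(u)\,du + \frac{\theta_{-k}\mu - \lambda}{\theta_{-k}\mu(x\mu - \lambda)} \int_0^{\theta_{-k}} c(u)\,du.$$

(c) If $x = \theta_{-k} = \frac{q_{-k}\lambda}{\mu}$ then

$$C_k(x, (\theta_{-k}, q_{-k})) = \frac{1}{\mu}\, c\!\left(\frac{q_{-k}\lambda}{\mu}\right) = \frac{1}{\mu}\, c(\theta_{-k}).$$

(d) If $x < \theta_{-k}$ or $x = \theta_{-k} > \frac{q_{-k}\lambda}{\mu}$ then

$$C_k(x, (\theta_{-k}, q_{-k})) = \frac{1}{x\mu} \int_0^x c(u)\,du.$$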
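
As a numerical sanity check of cases (a) and (d) of Lemma 4, the sketch below compares the closed forms with a direct evaluation of $\int_0^\infty c(X_t) e^{-\mu t}\,dt$, which is what the double integral defining $C_k$ reduces to after changing the order of integration. The helper names, the cost function $c(u) = 1/(1+u)$, the truncation horizon and the midpoint rule are all illustrative assumptions.

```python
import math


def integrate(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def cost_direct(c, traj, mu, horizon=40.0):
    """Truncated integral of c(X_t) * exp(-mu * t) over t in [0, horizon]."""
    return integrate(lambda t: c(traj(t)) * math.exp(-mu * t), 0.0, horizon)


def cost_case_a(c, x, lam, mu):
    """Closed form of case (a): state converging to lam / mu."""
    return integrate(c, x, lam / mu) / (lam - x * mu)


def cost_case_d(c, x, mu):
    """Closed form of case (d): state decaying to 0."""
    return integrate(c, 0.0, x) / (x * mu)


# Hypothetical data: c(u) = 1 / (1 + u), lam = 10, mu = 1.
c = lambda u: 1.0 / (1.0 + u)
lam, mu = 10.0, 1.0
direct_a = cost_direct(c, lambda t: 10.0 - 6.0 * math.exp(-t), mu)  # x = 4
direct_d = cost_direct(c, lambda t: 2.0 * math.exp(-t), mu)         # x = 2
```

Both direct values should agree with `cost_case_a(c, 4.0, lam, mu)` and `cost_case_d(c, 2.0, mu)` up to the quadrature error.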

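Proof Suppose $x > \theta_{-k}$ and $\theta_{-k} \le \frac{\lambda}{\mu}$ or
$x = \theta_{-k} < \frac{q_{-k}\lambda}{\mu}$. Then by (a) of Lemma 3

$$C_k(x, (\theta_{-k}, q_{-k})) = \int_0^\infty \int_0^\tau c\!\left(\frac{\lambda}{\mu} + \left(x - \frac{\lambda}{\mu}\right) e^{-\mu t}\right) \mu e^{-\mu\tau}\,dt\,d\tau,$$

which can be further written as

$$\int_0^\infty \int_t^\infty c\!\left(\frac{\lambda}{\mu} + \left(x - \frac{\lambda}{\mu}\right) e^{-\mu t}\right) \mu e^{-\mu\tau}\,d\tau\,dt = \int_0^\infty c\!\left(\frac{\lambda}{\mu} + \left(x - \frac{\lambda}{\mu}\right) e^{-\mu t}\right) e^{-\mu t}\,dt = \frac{1}{\lambda - x\mu} \int_x^{\lambda/\mu} c(u)\,du.$$

Next, suppose that $x > \theta_{-k} > \frac{\lambda}{\mu}$. Then, by (b) of
Lemma 3, $C_k(x, (\theta_{-k}, q_{-k}))$ equals

$$\int_0^\infty \left[ \int_0^{\min\{\tau,\, t_{(x,\theta_{-k})}\}} c\!\left(\frac{\lambda}{\mu} + \left(x - \frac{\lambda}{\mu}\right) e^{-\mu t}\right) dt + \int_{\min\{\tau,\, t_{(x,\theta_{-k})}\}}^{\tau} c\!\left(\theta_{-k} e^{\mu(t_{(x,\theta_{-k})} - t)}\right) dt \right] \mu e^{-\mu\tau}\,d\tau,$$

which can be further written as

$$\int_0^{t_{(x,\theta_{-k})}} \int_t^\infty c\!\left(\frac{\lambda}{\mu} + \left(x - \frac{\lambda}{\mu}\right) e^{-\mu t}\right) \mu e^{-\mu\tau}\,d\tau\,dt + \int_{t_{(x,\theta_{-k})}}^\infty \int_t^\infty c\!\left(\theta_{-k} e^{\mu(t_{(x,\theta_{-k})} - t)}\right) \mu e^{-\mu\tau}\,d\tau\,dt$$

$$= \int_0^{t_{(x,\theta_{-k})}} c\!\left(\frac{\lambda}{\mu} + \left(x - \frac{\lambda}{\mu}\right) e^{-\mu t}\right) e^{-\mu t}\,dt + \int_{t_{(x,\theta_{-k})}}^\infty c\!\left(\theta_{-k} e^{\mu(t_{(x,\theta_{-k})} - t)}\right) e^{-\mu t}\,dt$$

$$= \frac{1}{x\mu - \lambda} \int_{\theta_{-k}}^{x} c(u)\,du + \frac{1}{\theta_{-k} e^{\mu t_{(x,\theta_{-k})}} \mu} \int_0^{\theta_{-k}} c(u)\,du$$

$$= \frac{1}{x\mu - \lambda} \int_{\theta_{-k}}^{x} c(u)\,du + \frac{\theta_{-k}\mu - \lambda}{\theta_{-k}\mu(x\mu - \lambda)} \int_0^{\theta_{-k}} c(u)\,du.$$

Now, if $x = \theta_{-k} = \frac{q_{-k}\lambda}{\mu}$, then by (c) of Lemma 3

$$C_k(x, (\theta_{-k}, q_{-k})) = \int_0^\infty \int_0^\tau c\!\left(\frac{q_{-k}\lambda}{\mu}\right) \mu e^{-\mu\tau}\,dt\,d\tau = c\!\left(\frac{q_{-k}\lambda}{\mu}\right) \int_0^\infty \tau \mu e^{-\mu\tau}\,d\tau = \frac{1}{\mu}\, c\!\left(\frac{q_{-k}\lambda}{\mu}\right).$$

Finally, when $x < \theta_{-k}$ or $x = \theta_{-k} > \frac{q_{-k}\lambda}{\mu}$,
we can apply part (d) of Lemma 3, obtaining

$$C_k(x, (\theta_{-k}, q_{-k})) = \int_0^\infty \int_0^\tau c(x e^{-\mu t})\,\mu e^{-\mu\tau}\,dt\,d\tau,$$

which can be further written as

$$\int_0^\infty \int_t^\infty c(x e^{-\mu t})\,\mu e^{-\mu\tau}\,d\tau\,dt = \int_0^\infty c(x e^{-\mu t})\, e^{-\mu t}\,dt = \frac{1}{x\mu} \int_0^x c(u)\,du.$$

$^5$ In the degenerate case when $x = \frac{\lambda}{\mu}$,
$C_k(x, (\theta_{-k}, q_{-k})) = \frac{1}{\mu} c\!\left(\frac{\lambda}{\mu}\right)$,
which is the limit of the expression in (a) when $x \to \frac{\lambda}{\mu}$. We
will use a similar convention throughout the paper, putting
$\frac{1}{b-a}\int_a^b f(u)\,du = f(a)$ for $b = a$, if needed. This will reduce
the number of cases considered in subsequent results, without affecting the
validity of any of them.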

In the next two lemmas we characterize the best responses to any given
threshold strategies.

Lemma 5 A $[\theta_k, q_k]$-threshold policy is a best response of player $k$ to
a $[\theta_{-k}, q_{-k}]$-threshold policy used by all the others if $\theta_k$
is obtained by finding the unique solution to the equation

$$C_k(\theta_k, (\theta_{-k}, q_{-k})) = \gamma \qquad (6)$$

and taking any $q_k$. If equation (6) has no solutions then $\theta_k$ is taken
as the only value such that

$$C_k(x, (\theta_{-k}, q_{-k})) < \gamma \text{ for } x > \theta_k \quad \text{and} \quad C_k(x, (\theta_{-k}, q_{-k})) > \gamma \text{ for } x < \theta_k \qquad (7)$$

and $q_k = 1$ if the first inequality is satisfied for $x = \theta_k$, while
$q_k = 0$ if the second one is satisfied for $x = \theta_k$.
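
Since the cost of joining is decreasing in the state, the threshold of Lemma 5 can be located numerically by bisection on the sign of the cost minus $\gamma$. The sketch below is a hypothetical illustration: the function name, the bracketing interval and the example cost (case (d) of Lemma 4 with $c(u) = 1/(1+u)$ and $\mu = 1$, i.e. all others using a threshold above the search interval) are assumptions.

```python
import math


def best_response_threshold(cost, gamma, lo, hi, tol=1e-10):
    """Bisection for the point where cost(x) - gamma changes sign.

    Assumes cost is decreasing on [lo, hi] with cost(lo) > gamma > cost(hi).
    """
    if not (cost(lo) > gamma > cost(hi)):
        raise ValueError("gamma must lie strictly between cost(hi) and cost(lo)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cost(mid) > gamma:
            lo = mid  # joining is still too costly at mid: threshold lies above
        else:
            hi = mid  # joining already pays off at mid: threshold lies below
    return 0.5 * (lo + hi)


# Hypothetical example: (1 / (x * mu)) * integral of 1 / (1 + u) over [0, x].
mu = 1.0
cost_d = lambda x: math.log(1.0 + x) / (x * mu)
theta_k = best_response_threshold(cost_d, gamma=0.5, lo=0.1, hi=50.0)
```

Because only monotonicity is used, the same search also returns the sign-change point when the cost jumps over $\gamma$ and equation (6) has no solution.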


Proof First note that $C_k(x, (\theta_{-k}, q_{-k})) - \gamma$ is exactly the
expected cost of player $k$ if he joins the queue when its state is $x$, while
his cost when he does not join is 0. Moreover, the expected cost of a player
joining the queue is by Corollary 1 a monotone decreasing function of $x$. Thus,
equation (6) may have at most one solution, and the cost of joining the queue
for $x > \theta_k$ is negative, that for $x < \theta_k$ is positive, while that
for $x = \theta_k$ is 0, regardless of $q_k$. Thus the
$[\theta_k, q_k]$-threshold policy always gives player $k$ the smallest cost
available.

Similarly, when (6) has no solutions, from the monotonicity of the cost
$C_k(x, (\theta_{-k}, q_{-k})) - \gamma$, there must exist exactly one
$\theta_k$ such that $C_k(x, (\theta_{-k}, q_{-k})) - \gamma > 0$ for
$x < \theta_k$ and $C_k(x, (\theta_{-k}, q_{-k})) - \gamma < 0$ for
$x > \theta_k$. Now we can repeat the arguments from the proof of the first part
of the lemma to show that the $[\theta_k, 1]$- or $[\theta_k, 0]$-threshold
policy is the best response to the $[\theta_{-k}, q_{-k}]$-threshold policy of
the others in this case. □


Lemma 6 Let [Ok,qk]~threshold policy be a best response of player k to a 
[0_k,q-k]-threshold policy used by all the others and define O and O as the 
unique solutions to the following equations 6 : 


,0 


1 /V 1 

-/ c(u) du = 7 , yr— / c(u) du = 7 . 

X- Op Jo y 'Op J 0 


( 8 ) 


Then Ok and qk satisfy the following: 

(a) If $\gamma \in \left[0, \frac{1}{\mu} \lim_{u\to\infty} c(u)\right]$ then $\theta_k = \infty$ and $q_k$ is arbitrary (which means
that the best response is a policy never prescribing to enter the queue).
, c(u), j Jo* c(u ) dii) then 


( b ) J f 7 € ( l lim 


0 k (0-k) = < 


O-k, 


for 0- k < min {<9, ^ j 
for min j<9, ^ j 


4 }• < O-k < 4 


O(O-k), for a < O-k < O 


(®, 


for O-k > O 


where O is some uniquely defined function on satisfying 0(x) > x. 

“{ s 7}c 


qk is arbitrary for O-k ^ 


min [o, - \ 

A 

L l ’ i 1 / 

’ V. 


, while for O-k G 


0, if O-k > 9 l) X or O-k = and c > PI 

q k (O-k, q-k) = ^ arbitrary , if 0- k = and c = M7 or ®-k = 0 < 

1 , 


if O-k < or O-k = and c < PJ- 


6 If there are any solutions. 
(c) If ry£ 


A 

i U c ( u ) du > 



then 


Ok{G>-k) 


f 0- k , for O-k < 9 

\ G, for 9- k > 9 


qk is arbitrary for 9-k > 9, while for 9-k < 9, 


{ 0, if 9-k > ^ or 9-k = ^ and c > n 

q k {9- k ,q-k) = < arbitrary , if 9- k = and = /17 

[ 1 , if 9-k < or 9-k = and c (^jr) < IH- 

(d) If $\gamma \ge \frac{1}{\mu} c(0)$ then $\theta_k = 0$ and $q_k = 1$ (which means that the best response
is a policy always prescribing to enter the queue).


Proof To show (a) first note that any form of $C_k$ described in Lemma 4 is
bounded below by $\frac{1}{\mu} \inf_{u \ge 0} c(u)$, which equals
$\frac{1}{\mu} \lim_{u\to\infty} c(u)$, as $c$ is a strictly decreasing
function. Thus, in case $\gamma \le \frac{1}{\mu} \lim_{u\to\infty} c(u)$, also
$\gamma < C_k(x, (\theta_{-k}, q_{-k}))$ for any value of $x$, thus (7) is
satisfied for $\theta_k = \infty$. This means that the strategy never
prescribing player $k$ to enter the queue is his best response to the
$[\theta_{-k}, q_{-k}]$-threshold policy used by all the others.

Now suppose the assumptions of part, (b) of the lemma are satisfied. Note 
that the function 

A 

C(x) := -- [ c(u) du 

A - x/i J x 

_ A _ 

is continuous on M + and satisfies lim x _ ) . 0 + C(x) = ^ fj 1 c(u) du and linx^oo C(x) = 
7 lim^^oo c(u). Thus, by the intermediate value property, and since 

(1 1 fTi \ 

7 G — lim c(u), — / c(u) du , 
yyiu^oo A J 0 J 


there exists an x such that C(x) = 7, but this is exactly how 9 is defined. 
Similarly, the function 

C(x) := — [ c(u) du 

XU Jo 

is continuous on R + and satisfies C_ = j / 0 M c(u) du and lim^^oo 9( x ) = 
limu^oo c(u). Thus, again by the intermediate value property, and since 7 G 
limu^oo c(u), j c(u) dii'j , there exists an x such that 9( x ) = 7 , which 

is how 9 is defined. Moreover, 0 is always bigger than 

Next note that by Lemma 4, Ck (x, (9-k, q~k)) equals C(x) if x > 9-k 
and 9-k < 7 or x = 9-k < ^ 77 ^, and C(x ) if x < 9-k or x = 9-k > 
——. Thus by Lemma 5, Ok = & for O-k < min { O, -V Ok = 0- k for 
min j<9, A | < 0_ fc < A, O k > 0- k for A < 0_ fc < 0 and O k = 0 for 
O-k > O. The values of qk for O-k < - depend on the relation between 0_£ 
and : If the former is smaller, for Ok = O-k we are in the set where 
C k (Ok,{0- k ,q-k)) = 0(6>fc) < 7i and so = 1. If 0_ fc = we are in 
the set where Ck {Ok, {O-k, q~k)) = ~ c {0~k), thus according to Lemma 5 the 
value of qk depends on the relation between A c{0-k ) and 7, exactly as it is 
written in Lemma 6. Finally if 0- k > q =^, C k {O k , {0- k ,q- k )) = Q{O k ) > 7 , 

h 1 

and so qk = 0. 

To finish the proof of part (b) of the Lemma, we need to show that for 
O-k £ ^ A, o'j, Ok > O-k■ To do that, it is enough to prove that for any fixed 

O-k £ ^A,0^j, Ck {x, {O-k, q~k)) is continuous at x = 0- k as a function of 

x. If it is, then from the fact that for x = 0- k , C k {x, {0- k , q~k )) = Q(x) > 7, 
also Ck {x + e, {0- k , q~ k )) > 7 for some e > 0, and thus O k defined by Lemma 
5 is not smaller than 0- k + e. Thus fix 0- k £ ^,Q^j and take x n -A 0+ fc - 
For such x n , 


Ok { x n, {0—k,Q—k)) — 


Xn/J, - X 


:(w) du 


Q-kji - a r 

0-kH{x n ll - A) J o 


z{u) du 


O-k/i-X O-kH 


c{u ) du = C{0- k ) = C k {O-k, {O-k, q - k )), 


which proves the desired property. 

To prove part (c) of the lemma note that since now 7 £ c{u)du, dc( 0 )^ s 

the equation C(x) = 7 has no solutions. Moreover, its LHS is always smaller 
than its RHS. On the other hand, since lim^^ 0 + O(x) = ~ c(0) and C ^A^ = 

A 

X Jo* c{u) du, by the intermediate value property the equation C_{x) = 7 has 
a unique solution 0 £ ^0, A^. Next, again by Lemma 4, C k {x, {0- k , q~k)) 
equals C{x) if x > 0- k and 0- k < A 0 r x = 0- k < , and C(x) if 

x < 0-k or x = O-k > q ~ kX , and thus by Lemma 5, O k = 0- k for 0- k < 0 
and Ok = 0 for 0- k > 0. The choice of q k is made exactly as in part (b) of 
the lemma. 

Finally, suppose that $\gamma \ge \frac{1}{\mu} c(0)$. Then for any value of
$x$, $C_k(x, (\theta_{-k}, q_{-k})) \le \gamma$, and thus the optimal response
of player $k$ to the $[\theta_{-k}, q_{-k}]$-threshold strategy of all the
others is always to join the queue. □


Now we are ready to state the main result of this section. 
Theorem 1 The game under consideration always has a symmetric equilibrium
where each of the players uses the same $[\theta, q]$-threshold strategy.
Moreover:

(a) If $\gamma \in \left[0, \frac{1}{\mu} \lim_{u\to\infty} c(u)\right]$ then the equilibrium is unique, with $\theta = \infty$,
which means that the equilibrium policies prescribe every user never to
enter the queue.


0>) If 7 € ( Mim. u 


, c{u), j Jq m c(u) duj then there are infinitely many equi- 

\ . 


libria, whose forms depend on the relation between 0 and 
(bl) If 0 < 7 then there are equilibria of five types: 0 — 0 and any 
q > 0 = 0*, with 0* satisfying c(0*) = pj and q = 0 = 0 


and any q £ [ 0 , 1 ],' any 0 £ 




and q = 0 ; any 0 £ 


0 ,$ 


and q = 1 . 


(b2) If 0 = — then either 0 = 0 and q £ {0,1} or 0 = 0 and q is any 
number from [ 0 , 1 ]. 

(b3) If 0 > 7 then either 0 = 0 and q is an arbitrary number from [0,1] 
or 0 = — and q = 0 . 

(c) If T € 7 U c(u) du, ^c( 0 )^j then there are infinitely many equilibria of 

three types: with 0 £ [ 0 , 0 ] and q = 0 ; with 0 £ [ 0 , 0 ] and q = 1; with 
0 = 0* satisfying c( 0 *) = and q = 

(d) If $\gamma \ge \frac{1}{\mu} c(0)$ then the equilibrium is unique, with $\theta = 0$ and $q = 1$, which
means that the equilibrium policies prescribe every user to always enter
the queue.

Proof A strategy for any player k will induce a symmetric equilibrium if it is 
a best response to itself. Below we analyze which strategies may satisfy this 
condition. 

In case (a) it is obvious by (a) of Lemma 6 that the policy prescribing 
never to join the queue is always the best response to itself, and since this is 
the only best response to any policy, this is the only possible equilibrium. 

In case (b) 0 has to be either in interval min{0, 7 }, 7 or equal to 0. 
In the latter case it is clear that for any q the [0, q]-threshold policy will be 
the best response to itself. I 11 the former one, q and 0 must satisfy one of the 
following conditions: 

<7 = 0 and 0 > 7 ^ = 0 , which is always true for 0 > min | 0 , 7 j, so [ 0 , 0 ]- 
threshold policies form equilibria in this case. 

< 7=1 and 0 < ^ = 7, and so [ 0 , l]-threshold policies form equilibria for any 
0 G [min { 0 , A}, A). 

<7 = 1 and 0 = 77 = 7 with c ^7^ < 7/x, which is always true as long as 
©< 7 - 

0 = 0 < —, which implies that q > ^ 7 . It can always be satisfied when 0 is 





as assumed, so [ 0 , g]-threshold policies form an equilibrium in this case. 

and c(0) = /ry. Note however that by the definition of 0 and 
continuity of c, if 0 < ^ then there must exist a solution 0* to the equation 


0 = 




c( 0 ) = 747 in 




so 0* and q = is an equilibrium. In particular, if 


0 = then also 0* = - and q = 1 is one. 
m’ m h 

In case (c) 0 has to be in interval [0,0] and needs to be related to q in 
one of the following ways: 

<7 = 0 and 0 > — = 0 or 0 = 0 with c( 0 ) > /i 7 , which is always true in case 

(c)- 


<7 = 1 and 0 < ^7 = 7 , which is always true, as 0 < 0 < ^ in this case, 
which was shown in the proof of Lemma 6 . 

0 = ^ < 0 and c(0) = 717 . Note however that by the definition of 0 and 
continuity of c, there must exist some 0 * in the interval ( 0 , 0 ) such that 
c( 0 *) = 717 , so 0 * and q = is the only equilibrium in this case. 

Finally, in case (d), by (d) of Lemma 6 it is obvious that the policy always 
prescribing to join the queue is the best response to itself. Since this is the only 
best response to any policy in this case, this is the only possible equilibrium. 
This ends the proof of the theorem. □ 


Remark 2 It should be noted here that there are multiple equilibria in certain
situations. In that case, it is normally not clear which one would prevail.
Nevertheless, as the cost of being served is a decreasing function of $\theta$
and of $q$ for a fixed value of $\theta$, we may assume that the customers will
naturally choose the equilibrium strategies with the biggest values of $\theta$
and $q$. In Section 5 we will, nevertheless, analyze the social outcome of all
the possible equilibria, comparing them to the social optimum.


5 Social Optimum 

The social cost associated to some symmetric strategy profile $\pi$ can be
computed using the equality

$$C(x_0, \pi) = \pi_\infty(x_0)\, E[C_k(x_\infty(\pi, x_0))],$$

where $x_\infty(\pi, x_0)$ denotes the stationary state of the queue when the players
apply multipolicy $\pi$ and the initial state of the queue is $x_0$, while $\pi_\infty(x_0)$ is the
limit value of strategy $\pi$ when time goes to infinity (note that it may have
three values, depending on whether the trajectory of $X$ approaches $x_\infty(\pi, x_0)$
from above, from below, or is from some point constant).
If we assume that $\pi$ is a $[\theta, q]$-threshold policy, $C(x_0, \pi)$ equals:

$$C(x_0, (\theta, q)) := \begin{cases}
C_k(x_\infty(\pi, x_0), (\theta, q)) - \gamma, & \text{when } x_\infty(\pi, x_0) > \theta \text{ or } x_\infty(\pi, x_0) = \theta \text{ and } \pi_\infty(x_0) > \pi(x_\infty(\pi, x_0)) \\
q\left[C_k(x_\infty(\pi, x_0), (\theta, q)) - \gamma\right], & \text{when } x_\infty(\pi, x_0) = \theta \text{ and } \pi_\infty(x_0) = \pi(x_\infty(\pi, x_0)) \\
0, & \text{when } x_\infty(\pi, x_0) < \theta \text{ or } x_\infty(\pi, x_0) = \theta \text{ and } \pi_\infty(x_0) < \pi(x_\infty(\pi, x_0)).
\end{cases}$$

Note however that, as can be clearly seen from Lemma 3, when everyone
uses the same threshold policy, the only stationary states possible in the game
are $0$, $\frac{\lambda}{\mu}$ and $\frac{q\lambda}{\mu}$. Moreover, they can be easily deduced from the values of $\theta$,
$q$ and $x_0$, and thus the following lemma is true.

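Lemma 7 Suppose all the players in the game apply the same $[\theta, q]$-threshold
policy $\pi$. Then the social cost function in the game can be computed as follows:

(a) If $x_0 > \theta$ and $\theta \le \frac{\lambda}{\mu}$, or $x_0 = \theta < \frac{q\lambda}{\mu}$, then

$$C(x_0, (\theta, q)) = \frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right).$$

(b) If $x_0 = \theta = \frac{q\lambda}{\mu}$ then

$$C(x_0, (\theta, q)) = \frac{q}{\mu}\left(c\left(\frac{q\lambda}{\mu}\right) - \gamma\mu\right).$$

(c) In any other case $C(x_0, (\theta, q)) = 0$.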

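Proof If $x_0 > \theta$ and $\theta \le \frac{\lambda}{\mu}$, or $x_0 = \theta < \frac{q\lambda}{\mu}$, then by parts (a) of Lemmas 3 and
4, $X_t = \frac{\lambda}{\mu} + \left(x_0 - \frac{\lambda}{\mu}\right)e^{-\mu t} \to_{t\to\infty} \frac{\lambda}{\mu} = x_\infty(\pi, x_0)$ and
$C_k(x_\infty(\pi, x_0), (\theta, q)) = \frac{1}{\mu}c\left(\frac{\lambda}{\mu}\right)$. Finally $\pi_\infty(x_0) = 1$, as either $\theta < \frac{\lambda}{\mu}$, and so $\pi \equiv 1$ in some
neighbourhood of $\frac{\lambda}{\mu}$, or $\theta = \frac{\lambda}{\mu} < x_0$, and then the trajectory of $X$ approaches $\frac{\lambda}{\mu}$
from above, where $\pi$ prescribes to take action E with probability 1. Putting
all this together we get that $C(x_0, (\theta, q)) = \frac{1}{\mu}c\left(\frac{\lambda}{\mu}\right) - \gamma = \frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right)$.

Next, suppose that $x_0 = \theta = \frac{q\lambda}{\mu}$. Then by parts (c) of Lemmas 3 and 4,
$X_t \equiv \frac{q\lambda}{\mu}$, so this is also its stationary state, with $C_k\left(\frac{q\lambda}{\mu}, (\theta, q)\right) = \frac{1}{\mu}c\left(\frac{q\lambda}{\mu}\right)$
and $\pi_\infty\left(\frac{q\lambda}{\mu}\right) = q$. Putting all this in the definition of $C$ we obtain
$C(x_0, (\theta, q)) = \frac{q}{\mu}\left(c\left(\frac{q\lambda}{\mu}\right) - \gamma\mu\right)$.

If $x_0 > \theta > \frac{\lambda}{\mu}$, then by (b) of Lemma 3, $X_t \to 0$ as $t \to \infty$.
Similarly, if $x_0 < \theta$ or $x_0 = \theta > \frac{q\lambda}{\mu}$ then by (d) of the same lemma, $X_t =
x_0 e^{-\mu t} \to_{t\to\infty} 0$, and so in both cases $x_\infty(\pi, x_0) = 0 \le \theta$. This implies either
$x_\infty(\pi, x_0) < \theta$ if $\theta > 0$, or $x_\infty(\pi, x_0) = \theta$ and $0 = \pi_\infty(x_0) < \pi(x_\infty(\pi, x_0))$,
and therefore $C(x_0, (\theta, q)) = 0$. □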
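
The case analysis of Lemma 7 can be sketched in code. The Python function below is only an illustration: the name `social_cost`, the floating-point tolerance used to test equalities, and the example cost function $c(x) = 1/(a + x)$ in the usage note are assumptions made here, not part of the original text.

```python
import math


def social_cost(x0, theta, q, c, lam, mu, gamma, tol=1e-12):
    """Social cost C(x0, (theta, q)) following the three cases of Lemma 7.

    c is the (decreasing) service cost function, lam the arrival rate,
    mu the service rate and gamma the parameter gamma of the model.
    Equalities between floats are tested up to the hypothetical tolerance tol.
    """
    rho = lam / mu
    at_threshold = math.isclose(x0, theta, abs_tol=tol)
    above_threshold = x0 > theta and not at_threshold

    # Case (b): the queue starts at the threshold and stays at q * lam / mu.
    if at_threshold and math.isclose(theta, q * rho, abs_tol=tol):
        return q / mu * (c(q * rho) - gamma * mu)

    # Case (a): the queue converges to lam / mu and everybody joins there.
    if (above_threshold and theta <= rho) or (at_threshold and theta < q * rho):
        return (c(rho) - gamma * mu) / mu

    # Case (c): the queue empties, so nobody pays anything.
    return 0.0
```

For instance, with the assumed cost `c = lambda x: 1 / (0.2 + x)`, `lam = 5`, `mu = 10` and `gamma = 0.3`, the call `social_cost(0.2, 0.0, 1.0, c, 5, 10, 0.3)` falls into case (a), while `social_cost(0.2, 1.0, 1.0, c, 5, 10, 0.3)` falls into case (c) and returns zero.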
Using this lemma we can easily find strategies minimizing the social cost
for any $x_0$.

Theorem 2 (a) If $c\left(\frac{\lambda}{\mu}\right) < \gamma\mu$ then the social optimum equals $\frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right)$
and is attained for the strategy profile consisting of $[0, 1]$-threshold strategies
of all the players, prescribing to always join the queue.

(b) If $c\left(\frac{\lambda}{\mu}\right) = \gamma\mu$ then the social optimum equals 0 and is attained for any
symmetric strategy profile consisting of $[\theta, q]$-threshold strategies such that
$\theta \ne \frac{q\lambda}{\mu}$.

(c) If $c\left(\frac{\lambda}{\mu}\right) > \gamma\mu$ then the social optimum equals 0 and is attained for the
strategy profile consisting of $[\infty, 0]$-threshold strategies of all the players,
prescribing never to join the queue.

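Proof Suppose $c\left(\frac{\lambda}{\mu}\right) < \gamma\mu$. Then $\frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right) < 0$ and so it is always
more profitable to be in case (a) of Lemma 7 than in case (c). As $c$ is a strictly
decreasing function, also $\frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right) < \frac{q}{\mu}\left(c\left(\frac{q\lambda}{\mu}\right) - \gamma\mu\right)$. Thus a strategy
profile such that the assumptions of case (a) of Lemma 7 are satisfied for any
$x_0$ minimizes the social cost function then. It is straightforward to see that
when all the players use $[0, 1]$-threshold strategies this is the case.

Next assume that $c\left(\frac{\lambda}{\mu}\right) = \gamma\mu$. Then $\frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right) = 0 < \frac{q}{\mu}\left(c\left(\frac{q\lambda}{\mu}\right) - \gamma\mu\right)$,
so the social cost is minimized in cases (a) and (c) of Lemma 7. Thus any
profile of policies guaranteeing that case (b) is never possible, which is equivalent
to $\theta \ne \frac{q\lambda}{\mu}$, is optimal in this case.

Finally let $c\left(\frac{\lambda}{\mu}\right) > \gamma\mu$. Then $\frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right) > 0$ and $\frac{q}{\mu}\left(c\left(\frac{q\lambda}{\mu}\right) - \gamma\mu\right) > 0$,
so case (c) of Lemma 7 is better than cases (a) and (b). Using $[\infty, 1]$-threshold
policies guarantees satisfying the assumptions of case (c) for any
$x_0$. □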

5.1 Price of Anarchy and Price of Stability 

A commonly used concept for evaluating the equilibria in any given game
is that of the Price of Anarchy, introduced by Koutsoupias and Papadimitriou
[9], which is the ratio between the cost of an equilibrium and that of the
optimal solution. As in our game there may exist multiple equilibria, each
giving a different social cost, we would like to adapt here the concept of two
quantities describing the quality of equilibria [4]: the Price of Anarchy, being the ratio
between a worst (in terms of its social cost) equilibrium's cost and the optimal
social cost, and the Price of Stability, defined as the ratio between the cost of
a best equilibrium and that of the optimal solution. The problem in using
these quantities in our model could be that here, unlike in network congestion
games, the social cost may be both negative and positive (and in fact it often
equals zero). Note however that in any situation a player can guarantee himself
zero cost, so both in social optimum and in equilibrium it is never positive.
This suggests defining the Price of Anarchy and the Price of Stability in the following
manner:


$$PoA(x_0) = \max_{\pi \in \mathcal{NE}} \frac{C(x_0, \pi_{Opt})}{C(x_0, \pi)}, \qquad PoS(x_0) = \min_{\pi \in \mathcal{NE}} \frac{C(x_0, \pi_{Opt})}{C(x_0, \pi)}.$$

Here $\mathcal{NE}$ denotes the set of Nash equilibria in the game, while $\pi_{Opt}$ is an
optimal policy profile in the game. Also it is important to note that in both
definitions we use the conventions that $\frac{0}{0} = 1$ and $\frac{c}{0} = +\infty$ for a negative value of
$c$, so we treat $0$ as $0^-$.
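
The two ratios and their conventions can be written down directly. The snippet below is a minimal sketch, assuming that the social costs of the optimum and of every equilibrium have already been computed and are never positive; the function names are hypothetical.

```python
import math


def cost_ratio(opt_cost, eq_cost):
    """Ratio C(x0, pi_opt) / C(x0, pi) with the conventions 0/0 = 1
    and c/0 = +infinity for negative c (zero is treated as 0^-)."""
    if eq_cost == 0:
        return 1.0 if opt_cost == 0 else math.inf
    return opt_cost / eq_cost


def price_of_anarchy(opt_cost, equilibrium_costs):
    """Largest ratio over the (non-empty) collection of equilibrium costs."""
    return max(cost_ratio(opt_cost, e) for e in equilibrium_costs)


def price_of_stability(opt_cost, equilibrium_costs):
    """Smallest ratio over the (non-empty) collection of equilibrium costs."""
    return min(cost_ratio(opt_cost, e) for e in equilibrium_costs)
```

With a negative optimum and two equilibria, one of cost zero and one attaining the optimum, `price_of_anarchy` returns infinity while `price_of_stability` returns 1.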

The following theorems characterize PoA and PoS in our model. They are 
direct consequences of Theorems 1 and 2, and Lemma 7, and thus we state 
them without proofs. 


Theorem 3 The Price of Anarchy: 

(a) is infinite if 7 € c ( u ) du 


and x’o < 0 ; 

(b) equals 1 otherwise. 

Theorem 4 The Price of Stability: 


or if 7 £ 


IJ 


c(u) du, ^c(O) 


(a) is infinite if 7 £ ^ j , j Jf 1 c(u) duj and x 0 < 0 ; 

(b) equals 1 otherwise. 

As we can see, both PoA and PoS take only two values, 1 and $\infty$. This
is a consequence of the fact that the social cost of equilibrium happens to be
greater than that of optimal solution only if the former equals zero while the
latter is negative.


6 Fluid Model with Partial Information 

In this section we assume that the knowledge of each user when he decides on
entering the queue is limited to the information whether the state of the queue
is above some threshold $\Psi$ or not. Thus instead of $X_t \in \mathbb{R}_+$, the system state
perceived by the players will be $\tilde{X}_t \in \{0, 1\}$, with $\tilde{X}_t = 0$ denoting $X_t < \Psi$
and $\tilde{X}_t = 1$ denoting $X_t \ge \Psi$. Consequently the strategies of the players will
be of one of the forms EE, EN, NE or NN, where the first letter stands for
the strategy in state 0, while the second one for the strategy in state 1. Using
arguments from section 4 we can argue that strategy EN will never be used,
so for the ease of analysis we will only consider the three remaining ones. It
is also important to note that these three strategies can also be interpreted
as threshold strategies in the original game, only with the set of thresholds
available limited to $\{0, \Psi, \infty\}$ (for policies EE, NE and NN respectively). We
will analyze the equilibria in this model and, in particular, how they depend
on the value of $\Psi$. Then we shall check how this affects the social welfare.
We assume that the knowledge of each of the players is limited to the value
of the threshold $\Psi$ and the partial information about the state $\tilde{X}_t$. Thus the
users will assume that the actual state of the queue at the time they decide
on entering the queue is the one for which the cost of joining the queue is
the highest. By Corollary 1 this cost is decreasing in $X_t$, thus the players will
assume $X_t = 0$ if $\tilde{X}_t = 0$ and $X_t = \Psi$ if $\tilde{X}_t = 1$.

We will need some additional notation to formulate our main results. Let
$L_{EE}$, $L_{NE}$ and $L_{NN}$ denote the worst-case service cost for a player $i$ entering
the queue with $\tilde{X}_t = 0$, when all the other players apply strategy EE, NE or
NN, respectively. Similarly, let $H_{EE}$, $H_{NE}$ and $H_{NN}$ denote the worst-case
service cost for a player $i$ entering the queue with $\tilde{X}_t = 1$, when all the other
users play EE, NE or NN, respectively. We can use the interpretation of
the policies in our new model as threshold strategies in the original game and
Lemma 4 to obtain:


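$$L_{EE}(\Psi) = C_k(0, (0, 1)) = \frac{1}{\lambda}\int_0^{\lambda/\mu} c(u)\,du,$$

$$L_{NE}(\Psi) = C_k(0, (\Psi, 1)) = \frac{1}{\mu}c(0),$$

$$L_{NN}(\Psi) = C_k(0, (\infty, 1)) = \frac{1}{\mu}c(0),$$

$$H_{EE}(\Psi) = C_k(\Psi, (0, 1)) = \frac{1}{\lambda - \Psi\mu}\int_\Psi^{\lambda/\mu} c(u)\,du,$$

$$H_{NE}(\Psi) = C_k(\Psi, (\Psi, 1)) = \begin{cases}
\frac{1}{\Psi\mu}\int_0^{\Psi} c(u)\,du, & \text{when } \Psi > \frac{\lambda}{\mu} \\
\frac{1}{\lambda - \Psi\mu}\int_\Psi^{\lambda/\mu} c(u)\,du, & \text{when } \Psi < \frac{\lambda}{\mu},
\end{cases}$$

$$H_{NN}(\Psi) = C_k(\Psi, (\infty, 1)) = \frac{1}{\Psi\mu}\int_0^{\Psi} c(u)\,du.$$

All the main properties of functions $L_s$ and $H_s$, $s = EE, NE, NN$, are
summarized in the following lemma.

Lemma 8 For any $\Psi > 0$

(a) $L_{NN}(\Psi) = L_{NE}(\Psi) > L_{EE}(\Psi)$ and $H_{NN}(\Psi) \ge H_{NE}(\Psi) \ge H_{EE}(\Psi)$, with
$H_{NN}(\Psi) = H_{NE}(\Psi) > H_{EE}(\Psi)$ when $\Psi > \frac{\lambda}{\mu}$ and $H_{NN}(\Psi) > H_{NE}(\Psi) =
H_{EE}(\Psi)$ when $\Psi < \frac{\lambda}{\mu}$,

(b) $L_{NN}(\Psi) > H_{NN}(\Psi)$ and $L_{EE}(\Psi) > H_{EE}(\Psi)$.

Proof By the monotonicity of $c$ we can write:

$$L_{NN} = \frac{1}{\mu}c(0) > \frac{1}{\mu}\,\frac{1}{\lambda/\mu - 0}\int_0^{\lambda/\mu} c(u)\,du = L_{EE}$$

and (the inequality is true both in case $0 < \Psi < \frac{\lambda}{\mu}$ and when $\Psi > \frac{\lambda}{\mu}$; in the
degenerate case when $\Psi = \frac{\lambda}{\mu}$ the RHS reduces to $\frac{1}{\mu}c(\Psi)$, which is obviously
also smaller than the LHS):

$$H_{NN} = \frac{1}{\mu}\,\frac{1}{\Psi - 0}\int_0^{\Psi} c(u)\,du > \frac{1}{\mu}\,\frac{1}{\lambda/\mu - \Psi}\int_\Psi^{\lambda/\mu} c(u)\,du = H_{EE}.$$

This establishes the strict inequalities in (a). The equalities are direct
consequences of the formulas for $H_{NN}$, $H_{NE}$ and $H_{EE}$ written before the lemma.
Part (b) of the lemma also follows from the monotonicity of $c$, as:

$$L_{NN} = \frac{1}{\mu}c(0) > \frac{1}{\mu}\,\frac{1}{\Psi - 0}\int_0^{\Psi} c(u)\,du = H_{NN}$$

for $\Psi > 0$ and

$$L_{EE} = \frac{1}{\mu}\,\frac{1}{\lambda/\mu - 0}\int_0^{\lambda/\mu} c(u)\,du > \frac{1}{\mu}\,\frac{1}{\lambda/\mu - \Psi}\int_\Psi^{\lambda/\mu} c(u)\,du = H_{EE}.$$

□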
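
The six worst-case costs are all averages of $c$ over an interval, divided by $\mu$, which makes them easy to evaluate numerically. The sketch below uses a simple midpoint rule; the helper names, the number of integration steps, and the example cost $c(x) = 1/(a + x)$ in the usage note are assumptions introduced here for illustration.

```python
import math


def mean_value(c, lo, hi, steps=2000):
    """Midpoint-rule average of c over the interval between lo and hi
    (in either order); equals c(lo) when the interval is degenerate."""
    if math.isclose(lo, hi):
        return c(lo)
    width = (hi - lo) / steps
    return sum(c(lo + (i + 0.5) * width) for i in range(steps)) / steps


def worst_case_costs(psi, c, lam, mu):
    """Return L_EE, L_NE, L_NN, H_EE, H_NE, H_NN for a threshold psi >= 0."""
    rho = lam / mu
    l_ee = mean_value(c, 0.0, rho) / mu      # (1/lam) * integral of c over [0, rho]
    l_nn = c(0.0) / mu                       # also the value of L_NE
    h_ee = mean_value(c, psi, rho) / mu      # (1/(lam - psi*mu)) * integral over [psi, rho]
    h_nn = mean_value(c, 0.0, psi) / mu      # (1/(psi*mu)) * integral over [0, psi]
    h_ne = h_nn if psi > rho else h_ee       # degenerate case psi == rho gives c(psi)/mu
    return {"L_EE": l_ee, "L_NE": l_nn, "L_NN": l_nn,
            "H_EE": h_ee, "H_NE": h_ne, "H_NN": h_nn}
```

Calling `worst_case_costs(0.2, lambda x: 1 / (0.2 + x), 5.0, 10.0)` and comparing the returned values gives a quick numerical check of the orderings stated in Lemma 8 for that particular cost function.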


Now we are ready to formulate the main result. 

Theorem 5 For any $\Psi > 0$ the game with partial information has a pure-strategy
worst-case equilibrium. Moreover:

(a) When $\gamma > L_{NN}(\Psi)$ then all the players use policy EE in equilibrium;

(b) When $L_{NN}(\Psi) > \gamma > L_{EE}(\Psi)$ and $\gamma > H_{NN}(\Psi)$ then strategy profiles
where all the players use policy EE and where all the players use policy
NE are equilibria;

(c) When $H_{NN}(\Psi) > \gamma > \max\{L_{EE}(\Psi), H_{NE}(\Psi)\}$ then any strategy profile
where all the players use the same policy is an equilibrium;

(d) When $H_{NE}(\Psi) > \gamma > L_{EE}(\Psi)$ then strategy profiles where all the players
use policy EE and where all the players use policy NN are equilibria;

(e) When $L_{EE}(\Psi) > \gamma > H_{NN}(\Psi)$ then all the players use policy NE in
equilibrium;

(f) When $\min\{L_{EE}(\Psi), H_{NN}(\Psi)\} > \gamma > H_{NE}(\Psi)$ then strategy profiles where
all the players use policy NE and where all the players use policy NN are
equilibria;

(g) When $\min\{L_{EE}(\Psi), H_{NE}(\Psi)\} > \gamma$ then all the players use policy NN in
equilibrium.

Proof By Lemma 8 cases (a)-(g) cover all the possible situations in the game.
Then each of the cases follows directly from the definition of pure-strategy
Nash equilibrium. □
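
The case distinction of Theorem 5 amounts to checking, for each symmetric profile, that no player wants to change his action at either signal. The sketch below spells this out under two assumptions that are made here and are not stated in the text: a player is willing to enter whenever his worst-case cost does not exceed $\gamma$, and an indifferent player may choose either action. The function name is hypothetical.

```python
def symmetric_equilibria(gamma, l_ee, l_nn, h_ee, h_ne, h_nn):
    """List the policies (EE, NE, NN) that form symmetric worst-case equilibria.

    l_* and h_* are the worst-case costs at signal 0 and signal 1 when the
    other players use the policy in the subscript (L_NE equals L_NN).
    """
    equilibria = []
    # EE: entering must be acceptable at both signals when everybody enters.
    if gamma >= l_ee and gamma >= h_ee:
        equilibria.append("EE")
    # NE: staying out at signal 0 and entering at signal 1 must both be best.
    if l_nn >= gamma >= h_ne:
        equilibria.append("NE")
    # NN: staying out must be best at both signals when nobody enters.
    if l_nn >= gamma and h_nn >= gamma:
        equilibria.append("NN")
    return equilibria
```

For example, with the illustrative values `l_ee = 0.2506`, `l_nn = 0.5`, `h_ee = h_ne = 0.1865` and `h_nn = 0.3466`, the value `gamma = 0.3` falls into case (c) and the function returns all three policies, while `gamma = 0.6` falls into case (a) and only EE is returned.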

The following information about how the functions $L_s$ and $H_s$ behave
when $\Psi$ changes can be immediately derived from their definitions and the
monotonicity of $c$.

Lemma 9 For any $s \in \{EE, NE, NN\}$, $L_s(\Psi)$ is constant, while $H_s(\Psi)$ is
nonincreasing in $\Psi$. Moreover:

(a) $\lim_{\Psi \to 0} H_{NN}(\Psi) = \frac{1}{\mu}c(0)$,

(b) $\lim_{\Psi \to 0} H_{NE}(\Psi) = \lim_{\Psi \to 0} H_{EE}(\Psi) = \frac{1}{\lambda}\int_0^{\lambda/\mu} c(u)\,du$,

(c) $\lim_{\Psi \to \infty} H_{NN}(\Psi) = \lim_{\Psi \to \infty} H_{NE}(\Psi) = \lim_{\Psi \to \infty} H_{EE}(\Psi) = \frac{1}{\mu}\lim_{u \to \infty} c(u)$.

Using this lemma we can prove how the worst-case equilibria depend on
the value of the threshold $\Psi$.

Theorem 6 Worst-case equilibria in the game with partial information depend
on $\Psi$ in the following way:

(a) If $\gamma < \frac{1}{\mu}\lim_{u \to \infty} c(u)$ then all the players use strategy NN in the
equilibrium, regardless of $\Psi$.

(b) If $\gamma \in \left[\frac{1}{\mu}\lim_{u \to \infty} c(u), \frac{1}{\lambda}\int_0^{\lambda/\mu} c(u)\,du\right)$ then for $\Psi$ small enough all the
players use policy NN in the equilibrium, while for $\Psi$ approaching infinity
all the players use policy NE in the equilibrium.

(c) If $\gamma \in \left[\frac{1}{\lambda}\int_0^{\lambda/\mu} c(u)\,du, \frac{1}{\mu}c(0)\right)$ then for $\Psi$ small enough there are three
equilibria, in which all the players use the same policy, which is any of EE,
NE or NN policies, while for $\Psi$ approaching infinity either all the players
use policy NE or all the players use policy EE in equilibrium.

(d) If $\gamma = \frac{1}{\mu}c(0)$ then there are two equilibria regardless of $\Psi$, where all the
players use the same policy, which is either EE or NE.

(e) If $\gamma > \frac{1}{\mu}c(0)$ then all the players use strategy EE in the equilibrium,
regardless of $\Psi$.

Proof (a) If $\gamma < \frac{1}{\mu}\lim_{u \to \infty} c(u)$ then by Lemma 9 $H_{EE}$ is always bigger than $\gamma$.
Consequently, by Lemma 8 also $H_{NE} > \gamma$ and $L_{EE} > \gamma$ and thus by Theorem
5 all the players use policy NN in the worst-case equilibrium.

(b) If $\gamma \in \left[\frac{1}{\mu}\lim_{u \to \infty} c(u), \frac{1}{\lambda}\int_0^{\lambda/\mu} c(u)\,du\right)$ then by Lemma 9 for $\Psi$ small
enough also $H_{EE}(\Psi) > \gamma$. $L_{EE}(\Psi)$ is independent of $\Psi$ and always bigger
than $\gamma$, so by Theorem 5 all the players apply policy NN in the worst-case
equilibrium for such $\Psi$. On the other hand, if $\Psi$ is big enough, $H_{NE}(\Psi) < \gamma$.
Since, as already mentioned, also $L_{EE}(\Psi) > \gamma$, by Theorem 5 the strategy
profile where everybody plays NE is the only worst-case equilibrium for such
$\Psi$.

(c) If $\gamma \in \left[\frac{1}{\lambda}\int_0^{\lambda/\mu} c(u)\,du, \frac{1}{\mu}c(0)\right)$ then $H_{NE}(0) \le \gamma$ and $H_{NE}(\Psi) \le \gamma$ for
any bigger $\Psi$. $L_{EE}$ is independent of $\Psi$ and by assumption smaller than or
equal to $\gamma$. So, as long as $\Psi$ satisfies $H_{NN}(\Psi) > \gamma$, Theorem 5 implies that
profiles where everybody uses the same strategy, which is any of EE, NE or
NN, are equilibria. But for $\Psi$ close to 0, $H_{NN}(\Psi)$ is close to $\frac{1}{\mu}c(0) > \gamma$. Next, if
$\Psi$ approaches infinity, $L_{EE}(\Psi) \le \gamma < \frac{1}{\mu}c(0) = L_{NN}(\Psi)$ and $H_{NN}(\Psi)$ goes to
$\frac{1}{\mu}\lim_{u \to \infty} c(u) < \gamma$, thus for $\Psi$ big enough we have two worst-case equilibria,
where either all the players use policy NE or all use policy EE.

(d) If $\gamma = \frac{1}{\mu}c(0)$ then for any value of $\Psi$, both $H_{NN}$ and $L_{EE}$ are smaller
than $\gamma$. On the other hand, $L_{NN} = \gamma$ and so by Theorem 5 for any $\Psi$ there
are two worst-case equilibria, where either all the players use policy NE or all
use policy EE.

(e) If $\gamma > \frac{1}{\mu}c(0)$ then $L_{NN}$ is always smaller than $\gamma$, and thus the only
worst-case equilibrium for any value of $\Psi$ is when everyone applies policy EE.

□

Remark 3 Note that the limits when $\Psi$ is taken to infinity and zero make
the signal completely uninformative on the state of the system. Thus, we can
easily derive from Theorem 6 the equilibria in our game when the queue is
completely unobservable. It is enough to look at the action prescribed to be
taken above the threshold when $\Psi \to 0$ or the one below the threshold when
$\Psi \to \infty$. It turns out that in cases (a) and (b) uninformed players should not
enter the queue, in cases (d) and (e) they should enter the queue, while in
case (c) there are two equilibria where players either enter or do not enter the
queue.

Roughly speaking, Theorem 6 suggests that by increasing $\Psi$ we can increase
the set of global states for which the players would enter the queue. Since this
will also affect the stationary state of the queue, which, as we can see from
section 5, is crucial for the social welfare, it seems that by a proper choice of
$\Psi$ we can make the social welfare very close to its optimal value. We study
the above idea in the perspective of social cost in section 7 where a hierarchy
will be introduced in the game with the social planner choosing $\Psi$ at the
first stage, and then the users playing the partial-information game from the
present section at the second one.


7 Introducing Hierarchy to Boost the Performance of Equilibria 

In this section we assume that the game is played in two stages. In the first
stage the social planner, having all the information about the game, including
the actual value of $x_0$, chooses $\Psi$ and announces it to the players. His goal
is to minimize the social cost $C(x_0, \pi)$ by appropriately limiting the data
available to the players. In the second stage the users play the game considered
in section 6 using all the information they have, which only consists of the
announced value of $\Psi$, assuming that the state of the queue when they decide
about entering is the worst possible. We will see that this kind of hierarchical
formulation can reduce the social cost of equilibrium.

We first study how equilibria in the hierarchical model will look. Towards
this end, we consider the pessimistic and the optimistic case. In the pessimistic
setting the social planner chooses $\Psi$ in order to minimize $\max_{\pi \in \mathcal{NE}} C(x_0, \pi)$,
so he assumes that whenever the players choose their strategies, they choose
the equilibrium which yields the highest social cost. In the optimistic case the
social planner chooses $\Psi$ minimizing $\min_{\pi \in \mathcal{NE}} C(x_0, \pi)$, assuming that the
players choose the equilibrium which yields the lowest social cost. $\mathcal{NE}$ above
denotes the set of Nash equilibria of the game of the second stage. The result
is summarized in the following theorem:


Theorem 7 In the worst-case hierarchical model: 


(a) If $\gamma < \frac{1}{\mu}\lim_{u \to \infty} c(u)$ then the social planner chooses any $\Psi$, with all the
players using strategy NN in the equilibrium.


(b) If 7 G ^ lim, u _j. 0O c(u), M then the social planner chooses any E < 

0 and all the players use policy NN in the equilibrium. 

(c) If 7 e (±c (*) ,i f} c(u) du) then the social planner chooses E = 0 
with all the players using policy NE in the equilibrium. 


(d) If 7 G 


Uo 


c(u ) du , -c( 0 ) 


then in pessimistic case the social planner 


chooses E = 0 and all the players use policy NE in equilibrium: in the 
optimistic case the social planner chooses any E and all the players use 
policy EE in equilibrium. 

(e) If 7 > ^c(O) then in the optimistic case the social planner chooses E = 0 
while all the players use strategy 1 EE or NE in the equilibrium. 


Proof In cases (a) and (b) the social planner wants the players never to enter 
the queue. In case (a) never entering the queue is the equilibrium, regardless of 
E. In case (b) forcing players to use NN policies requires choosing threshold 
E such that H^e^) > 7- If we compare the definition of Hne with ( 8 ), we 
obtain that !P < 0, as for 7 < ^ j. Hme{&) can only obtain the 7 value 

for E > -. 

n 

In cases (c)-(e) the social optimum is achieved if players always enter 
the queue. Thus the social planner forces the players to use EE policies if 
possible (optimistic scenarios in (d) and (e)). In case (d) this means choosing 
any value of threshold E, in case (e) this means choosing E = 0 , so that two 
equilibrium policies NE and EE were equivalent. If forcing the players to 
use EE policies is impossible, the social planner chooses the lowest possible 
E such that the players would use NE, and not NN policies in equilibrium. 
In case (c) this means the smallest E such that Hne(E) = 7, which for 

c(u ) duj equals 0 (in such a case H ne (E) obtains the 7 

value both for E < ^ and E > ^-). T 11 the pessimistic variant of case (d) this 
means choosing E such that Hnn^) = 7, which, by the definition of Hnn 
and ( 8 ) is equivalent to E = 0. □ 

Remark 4 Note that in Theorem 7 we did not consider the case of $\gamma = \frac{1}{\mu}c\left(\frac{\lambda}{\mu}\right)$.
This is because in this case the social cost of any policy (used by all the players)
equals 0. Thus the social planner may choose any value of $\Psi$. This however
may result in different equilibria in the game of the second stage.

7 For $\Psi = 0$ they are equivalent.

We further analyze how this result affects the Price of Anarchy and the Price
of Stability in our model. Both these quantities are computed ex post, that
is, we assume that all the users have their knowledge about the state of the
queue limited when they make their decisions, but PoA and PoS are computed
when all the state information is revealed. This allows us to compare the
results obtained in the hierarchical model with the ones obtained for the full
information case. The result presented below is an immediate consequence of
Theorem 7, and thus presented without proof.


Theorem 8 In worst-case hierarchical model: 

1. The Price of Stability is infinite if 7 £ (n) ’ X Jo* c ( u ) du 

0. Otherwise it equals 1. 

2. The Price of Anarchy is infinite if 7 e { // c{u) du 

i/o" du, ^0(0)^ and Xq < 0- Otherwise it equals 1. 


and Xq < 


or if 7 G 


As we can see from Theorem 8, when the only information available to the
players is an indicator of the state being above or below $\Psi$, then both PoA and
PoS stay the same as in the model with full information. Thus we can claim
that we can reduce the information given to the players without degrading
the performance. On the other hand, we cannot improve it only by choosing
an appropriate signal to send.


8 Approximation of the Discrete Model 

In the section below we present a result which links the equilibria of the fluid
model with $\varepsilon$-equilibria of the discrete model when the incoming rate is
sufficiently high. To formulate it, we need to introduce some additional notation,
differentiating between the discrete and the fluid model. Let us start with
fixing that the function $c$ and parameters $\lambda$ and $\mu$ define the fluid model, whose
state will be denoted by $X_t$. Then let $M_n$ be a discrete model with service cost
$c^n(x) = c\left(\frac{x}{n}\right)$, incoming rate $\lambda^n = n\lambda$ and service rate $\mu$. The state in model
$M_n$ will be denoted by $X^n_t$, while

$$\bar{X}^n_t := \frac{1}{n}X^n_t$$

will be its normalized state. Using this notation we can formulate the main
result of this section and its proof.

Theorem 9 Suppose that the initial (normalized) state of the queue $x_0 \in
[0, x_{\max}]$ for some fixed $x_{\max}$ and that the user $k$ plays against $[\theta, q]$-threshold
policies of all the others (denoted shortly as $\pi$ policies) in the fluid model with
service cost $c$, incoming rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there
exists an $N$ such that for any $n > N$ his expected cost from entering the queue
in the discrete model $M_n$,

$$E\left[C^n_k(X^n_t(\pi^n))\right] = E\left[\int_{t_k}^{t_k + \sigma_k} c^n(X^n_t(\pi^n))\,dt\right] - \gamma,$$

where $\pi^n$ denotes a $[n\theta, q]$-threshold policy (which is a proper rescaling of
policy $\pi$ to fit $M_n$), differs from the expected cost $E[C_k(X_t(\pi))]$ in the fluid
model by at most $\varepsilon$.


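Proof Let us consider two policies for the discrete model $M_n$:

$$\overline{\pi}^{\beta,n}(x) = \begin{cases}
0, & \text{when } x < n(\theta - \beta) \\
\frac{x - n(\theta - \beta)}{n\beta}, & \text{when } x \in [n(\theta - \beta), n\theta] \\
1, & \text{when } x > n\theta
\end{cases}$$

and

$$\underline{\pi}^{\beta,n}(x) = \begin{cases}
0, & \text{when } x < n\theta \\
\frac{x - n\theta}{n\beta}, & \text{when } x \in [n\theta, n(\theta + \beta)] \\
1, & \text{when } x > n(\theta + \beta).
\end{cases}$$

They are rescalings of the following policies for the fluid model:

$$\overline{\pi}^{\beta}(x) = \begin{cases}
0, & \text{when } x < \theta - \beta \\
\frac{x - \theta + \beta}{\beta}, & \text{when } x \in [\theta - \beta, \theta] \\
1, & \text{when } x > \theta
\end{cases}$$

and

$$\underline{\pi}^{\beta}(x) = \begin{cases}
0, & \text{when } x < \theta \\
\frac{x - \theta}{\beta}, & \text{when } x \in [\theta, \theta + \beta] \\
1, & \text{when } x > \theta + \beta.
\end{cases}$$

These policies differ from the $[\theta, q]$-threshold policy $\pi$ only on sets $(\theta - \beta, \theta)$
or $(\theta, \theta + \beta)$ respectively. Next, consider equation (3) when all the players
apply policy $\overline{\pi}^{\beta}$. It can be directly computed that the solution $X_t(\overline{\pi}^{\beta})$ has the
following properties:

(i) $X_t(\overline{\pi}^{\beta}) \to X_t(\pi)$ pointwise as $\beta \to 0$.

(ii) Whenever $X_t(\overline{\pi}^{\beta}) \notin (\theta - \beta, \theta)$, it is of the form $X_t(\overline{\pi}^{\beta}) = D_1 e^{-\mu t} + D_2$
for some constants $D_1, D_2$ with $|D_1| \le \max\left\{x_0, \frac{\lambda}{\mu}\right\} \le \max\left\{x_{\max}, \frac{\lambda}{\mu}\right\}$.

(iii) When $X_t(\overline{\pi}^{\beta}) \in (\theta - \beta, \theta)$, it satisfies the equation

$$\dot{X}_t(\overline{\pi}^{\beta}) = \lambda\,\frac{X_t(\overline{\pi}^{\beta}) - \theta + \beta}{\beta} - \mu X_t(\overline{\pi}^{\beta})$$

and consequently

$$\left|\dot{X}_t(\overline{\pi}^{\beta})\right| \le \max_{x \in (\theta - \beta, \theta)} \left|\lambda\,\frac{x - \theta + \beta}{\beta} - \mu x\right| \le \lambda + \mu\theta.$$

Properties (ii) and (iii) clearly imply that $X_t(\overline{\pi}^{\beta})$ is Lipschitz-continuous with
constant $\max\{\mu x_{\max}, \lambda + \mu\theta\}$, independent of $\beta$. Thus all the functions
$X_t(\overline{\pi}^{\beta})$ are equicontinuous (as functions of $t$).

Next, we can find $T_\varepsilon$ such that

$$E[\sigma_k \mid \sigma_k > T_\varepsilon]\,P[\sigma_k > T_\varepsilon] < \frac{\varepsilon}{8c(0)}. \qquad (9)$$

Clearly, as $X_t(\overline{\pi}^{\beta})$ are equicontinuous and converging to $X_t(\pi)$, by the Arzelà-Ascoli
theorem $X_t(\overline{\pi}^{\beta})$ converges to $X_t(\pi)$ uniformly on the interval $[0, T_\varepsilon]$. On
the other hand, $c$ is continuous, decreasing and bounded, thus it is uniformly
continuous, which means that there exists a $\delta > 0$ such that for any $x, y$ such
that $|x - y| < \delta$ we have $|c(x) - c(y)| < \frac{\varepsilon}{8T_\varepsilon}$. Using uniform convergence of
$X_t(\overline{\pi}^{\beta})$ we can further conclude that there exists a $\beta > 0$ such that

$$\sup_{t \in [0, T_\varepsilon]} \left|c(X_t(\overline{\pi}^{\beta})) - c(X_t(\pi))\right| < \frac{\varepsilon}{8T_\varepsilon}. \qquad (10)$$

Now note that by the Kurtz theorem (see Theorem 5.3 in [11]),

$$P\left[\sup_{0 \le t \le T_\varepsilon} \left|\bar{X}^n_t(\overline{\pi}^{\beta,n}) - X_t(\overline{\pi}^{\beta})\right| \ge \delta\right] \le D e^{-nF(\delta)}$$

for some positive constant $D$ and a function $F$ satisfying $\lim_{\delta \to 0} \frac{F(\delta)}{\delta^2} \in (0, \infty)$.
By this last property, the probability bounded above converges to zero as $n$
goes to infinity at rate of $e^{-n}$, so for $n$ large enough this probability is not
bigger than $\frac{\varepsilon}{8T_\varepsilon c(0)}$.

Next, using uniform continuity of $c$ we can write:

$$\left|\bar{X}^n_t(\overline{\pi}^{\beta,n}) - X_t(\overline{\pi}^{\beta})\right| < \delta \;\Rightarrow\; \left|c(\bar{X}^n_t(\overline{\pi}^{\beta,n})) - c(X_t(\overline{\pi}^{\beta}))\right| < \frac{\varepsilon}{8T_\varepsilon} \;\Leftrightarrow\; \left|c^n(X^n_t(\overline{\pi}^{\beta,n})) - c(X_t(\overline{\pi}^{\beta}))\right| < \frac{\varepsilon}{8T_\varepsilon}. \qquad (11)$$

Finally we can write

$$\left|E[C^n_k(X^n_t(\overline{\pi}^{\beta,n}))] - E[C_k(X_t(\pi))]\right|$$

$$\le E\left[\int_0^{T_\varepsilon} \left|c^n(X^n_t(\overline{\pi}^{\beta,n})) - c(X_t(\pi))\right| dt\right] + c(0)\,E[\sigma_k \mid \sigma_k > T_\varepsilon]\,P[\sigma_k > T_\varepsilon]$$

$$\le \int_0^{T_\varepsilon} \left|c(X_t(\overline{\pi}^{\beta})) - c(X_t(\pi))\right| dt + E\left[\int_0^{T_\varepsilon} \left|c^n(X^n_t(\overline{\pi}^{\beta,n})) - c(X_t(\overline{\pi}^{\beta}))\right| dt\right] + c(0)\,E[\sigma_k \mid \sigma_k > T_\varepsilon]\,P[\sigma_k > T_\varepsilon] \qquad (12)$$

$$\le T_\varepsilon \sup_{t \le T_\varepsilon} \left|c(X_t(\overline{\pi}^{\beta})) - c(X_t(\pi))\right| + T_\varepsilon \sup_{t:\,|\bar{X}^n_t(\overline{\pi}^{\beta,n}) - X_t(\overline{\pi}^{\beta})| < \delta} \left|c^n(X^n_t(\overline{\pi}^{\beta,n})) - c(X_t(\overline{\pi}^{\beta}))\right| \qquad (13)$$

$$\quad + c(0)\,T_\varepsilon\,P\left[\sup_{0 \le t \le T_\varepsilon} \left|\bar{X}^n_t(\overline{\pi}^{\beta,n}) - X_t(\overline{\pi}^{\beta})\right| \ge \delta\right] + c(0)\,E[\sigma_k \mid \sigma_k > T_\varepsilon]\,P[\sigma_k > T_\varepsilon]$$

$$< T_\varepsilon\frac{\varepsilon}{8T_\varepsilon} + T_\varepsilon\frac{\varepsilon}{8T_\varepsilon} + c(0)\,T_\varepsilon\frac{\varepsilon}{8T_\varepsilon c(0)} + c(0)\frac{\varepsilon}{8c(0)} = \frac{\varepsilon}{2}, \qquad (14)$$

where the last inequality is a consequence of (9), (10), (11) and the bound on
the probability that $\bar{X}^n_t(\overline{\pi}^{\beta,n})$ and $X_t(\overline{\pi}^{\beta})$ differ by more than $\delta$ (recall that
$c(0)$ is the biggest value of $c$).

Now we can repeat all the above considerations for policies $\underline{\pi}^{\beta}$ and $\underline{\pi}^{\beta,n}$,
obtaining a similar inequality

$$\left|E[C^n_k(X^n_t(\underline{\pi}^{\beta,n}))] - E[C_k(X_t(\pi))]\right| < \frac{\varepsilon}{2}. \qquad (15)$$

To complete the proof note that $X^n_t(\underline{\pi}^{\beta,n})$, $X^n_t(\pi^n)$ and $X^n_t(\overline{\pi}^{\beta,n})$ are birth-death
processes starting at the same $x_0$, with the same death rate, but with
increasing birth rates. As a consequence $X^n_t(\underline{\pi}^{\beta,n})$ is for any $t \ge 0$ stochastically
dominated by $X^n_t(\pi^n)$, which in turn is stochastically dominated by $X^n_t(\overline{\pi}^{\beta,n})$.
This however implies that

$$E\left[\int_{t_k}^{t_k + \sigma_k} c^n(X^n_t(\underline{\pi}^{\beta,n}))\,dt\right] \ge E\left[\int_{t_k}^{t_k + \sigma_k} c^n(X^n_t(\pi^n))\,dt\right] \ge E\left[\int_{t_k}^{t_k + \sigma_k} c^n(X^n_t(\overline{\pi}^{\beta,n}))\,dt\right],$$

which is equivalent to

$$E\left[C^n_k(X^n_t(\underline{\pi}^{\beta,n}))\right] \ge E\left[C^n_k(X^n_t(\pi^n))\right] \ge E\left[C^n_k(X^n_t(\overline{\pi}^{\beta,n}))\right].$$

But this, together with (14) and (15) implies the thesis of the theorem. □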
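
The convergence behind Theorem 9 can be illustrated with a small simulation. The sketch below is not the construction used in the proof: it simply simulates the birth-death process of model $M_n$ with a standard event-by-event (Gillespie-type) scheme under an $[n\theta, q]$-threshold policy and compares the normalized state with the fluid trajectory for the special case where everybody joins. All function names and the chosen parameter values are assumptions made for illustration, and the comparison concerns the state of the queue only, not the expected cost.

```python
import math
import random


def simulate_normalized_state(n, lam, mu, theta, q, x0, horizon, rng):
    """Simulate model M_n (arrival rate n*lam, service rate mu per customer)
    under an [n*theta, q]-threshold policy and return X^n_T / n at T = horizon."""
    x = round(n * x0)
    t = 0.0
    while True:
        # Joining probability prescribed by the threshold policy.
        if math.isclose(x, n * theta):
            join_prob = q
        elif x > n * theta:
            join_prob = 1.0
        else:
            join_prob = 0.0
        birth_rate = n * lam * join_prob
        death_rate = mu * x
        total_rate = birth_rate + death_rate
        if total_rate == 0.0:
            return x / n                      # absorbed: empty queue, nobody joins
        t += rng.expovariate(total_rate)      # time of the next event
        if t > horizon:
            return x / n
        if rng.random() * total_rate < birth_rate:
            x += 1
        else:
            x -= 1


def fluid_state_all_join(lam, mu, x0, t):
    """Fluid trajectory X_t = lam/mu + (x0 - lam/mu) * exp(-mu*t) when everybody joins."""
    return lam / mu + (x0 - lam / mu) * math.exp(-mu * t)
```

Running `simulate_normalized_state(2000, 5.0, 10.0, 0.0, 1.0, 0.2, 1.0, random.Random(0))` and comparing the result with `fluid_state_all_join(5.0, 10.0, 0.2, 1.0)` shows the two values lying close together, and the gap typically shrinks as `n` grows.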


Using Theorem 9 we can immediately show that all the results proved for
the mean-field model can be viewed as good approximations of what happens
in the discrete case when the arrival rates go to infinity. This is formulated in
the five corollaries below.


Corollary 3 Suppose that the initial (normalized) state of the queue $x_0 \in
[0, x_{\max}]$ for some fixed $x_{\max}$ and that $[\theta, q]$-threshold policies of all the players
form an equilibrium in the fluid model with service cost $c$, incoming rate $\lambda$ and
service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n > N$
$[n\theta, q]$-threshold policies form $\varepsilon$-equilibria in discrete models $M_n$.

Corollary 4 Suppose all the players have the same statistical information
about the system state $p$ and that for some $\Psi > 0$, $f$ policies of all the players
(where $f$ is of one of three types: EE, NE, NN) form an equilibrium in
the Bayesian partial information fluid model with service cost $c$, incoming
rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that
for any $n > N$ $f$ policies form $\varepsilon$-equilibria in Bayesian partial information
counterparts of discrete models $M_n$.

Corollary 5 Suppose that for some $\Psi > 0$, $f$ policies of all the players (where
$f$ is of one of three types: EE, NE, NN) form an equilibrium in the worst-case
partial information fluid model with service cost $c$, incoming rate $\lambda$ and
service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n > N$
$f$ policies form $\varepsilon$-equilibria in worst-case partial information counterparts of
discrete models $M_n$.

Corollary 6 Suppose all the players have the same statistical information
about the system state $p$ and that $\Psi$ and $f$ policies of all the players (where $f$
is of one of three types: EE, NE, NN) form an equilibrium in the hierarchical
Bayesian partial information fluid model with service cost $c$, incoming rate $\lambda$
and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any
$n > N$ $n\Psi$ and $f$ policies for all the players form $\varepsilon$-equilibria in hierarchical
Bayesian partial information counterparts of discrete models $M_n$.

Corollary 7 Suppose that $\Psi$ and $f$ policies of all the players (where $f$ is of
one of three types: EE, NE, NN) form an equilibrium in the hierarchical
worst-case partial information fluid model with service cost $c$, incoming rate
$\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any
$n > N$ $n\Psi$ and $f$ policies for all the players form $\varepsilon$-equilibria in hierarchical
worst-case partial information counterparts of discrete models $M_n$.
In order to solve the above equation, we use the Matlab function fsolve.

Throughout this section we consider $a = 0.2$, $\lambda = 5$, $\mu = 10$ for all
simulations. Thus, $\rho = 0.5$. Also, note that

$$\frac{1}{\lambda}\int_0^{\rho} c(u)\,du = \frac{1}{\lambda}\log\left(\frac{\rho + a}{a}\right) = 0.2503 \qquad (18)$$
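
Threshold values of the kind reported later in this section can be cross-checked without fsolve. The sketch below is an illustration under explicit assumptions: it takes the service cost to be $c(x) = 1/(a + x)$, which is consistent with the logarithm in (18) but is assumed here, and it uses the closed forms of the costs $H_{EE}$ and $H_{NN}$ from Section 6 together with plain bisection. The function names are hypothetical.

```python
import math

A, LAM, MU = 0.2, 5.0, 10.0   # parameters a, lambda, mu used in the simulations
RHO = LAM / MU


def cost_all_join(theta):
    """(1/(lam - theta*mu)) * integral of c over [theta, rho] for c(x) = 1/(a + x);
    requires theta != rho."""
    return math.log((A + RHO) / (A + theta)) / (LAM - theta * MU)


def cost_nobody_joins(theta):
    """(1/(mu*theta)) * integral of c over [0, theta] for c(x) = 1/(a + x);
    requires theta > 0."""
    return math.log1p(theta / A) / (MU * theta)


def solve_decreasing(f, target, lo, hi, iterations=200):
    """Bisection for f(theta) = target, assuming f is decreasing on [lo, hi]
    with f(lo) > target > f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For example, `cost_all_join(0.2)` and `cost_nobody_joins(0.2)` give the values of $\gamma$ at which the respective thresholds cross $x_0 = 0.2$, and `solve_decreasing(cost_nobody_joins, 0.2503, 1e-9, 10.0)` returns the threshold corresponding to $\gamma = 0.2503$.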


9.1 Complete Information Game 


In this section, we numerically analyze the setting when each player has
complete information of the game. First we numerically evaluate all the possible
NE strategy profiles using Theorem 1.

1. $\gamma \ge \frac{1}{a\mu} = 0.5$, then $\theta = 0$ and $q = 1$. Thus, players will always enter the
queue.

2. 7 G [0.2503, 0.5), then there are infinitely many equilibria which are of the 
following types: 

(a) 0 G [0, 0\,q = 0. 

'(b) O € [ 0 , 0 ], = 1 . 

1 - ap'y 1 - ap 7 

c) 0 = - ,q= —-- 

p'y A 7 

3. 7 G (0,0.2503), then there are infinitely many equilibria, which are of the 
following types: 

(a) If 0 < p, (which occurs when 7 > -) then there are five types of 
equilibria: 


(b) 

(c) 


- 0 = 0,q>- 

P 

1 - ap'y 1 - ap'y 

- 0 = - ,q= —,- 

PI A 7 

- 0 , q ejo, 1] 

- 0 G [0,p\,q = 0. 

- 0 G [0,p},q = 1. 

If 0 = p (which occurs when 7 = -), then either 0 = 0 and q 
or 0 and q G [0, l ] 8 

If 0 > p (which occurs when 7 < -), then either 0 = 0 ,q G 
0 = p and q = 0 . 


e {0,1} 

[ 0 , 1 ] or 


Figure 1 shows the variation of 0 and the lowest possible threshold value of 
NE strategy profile with 7. From the above characterization of NE, it is easy 
to discern that the lower threshold value is max{min{ 0 , p}, 0 } i.e. there is no 
NE with the threshold value lower than the above value. The upper threshold 
is always given by 0 i.e. there is no NE with the threshold value higher than 
0. Note that lower threshold value goes to 0 at 7 = 0.2503 and the upper 
threshold value goes to 0 at 7 = 0.5. 


In this case <9 = 1.4966 

Fig. 1 Variation of 0 and Lower Threshold value with 7 



Fig. 2 Variation of optimal social cost and the best case social cost with $\gamma$ for complete
information game


For the rest of the simulation results, we consider $x_0 = 0.2$. Note that $c(\rho) = \gamma\mu$ when $\gamma = \frac{1}{7}$. Figure 2 shows the variation of the optimal social cost and the highest possible social cost at an equilibrium with $\gamma$. The optimal social cost is zero when $\gamma < \frac{1}{7}$, but when $\gamma > \frac{1}{7}$ the optimal cost increases linearly, as is evident from Theorem 2. From (17) we obtain that $\hat{\theta} > 0.2$ when $\gamma < 0.1865$ and $\hat{\theta} < 0.2$ when $\gamma > 0.1865$. Thus, when $\gamma < 0.1865$ the social cost under the best equilibrium is zero by Theorem 4. But when $\gamma > 0.1865$ the social cost under the best equilibrium is exactly the same as the optimal social cost.

Figure 3 shows the variation of the optimal social cost and the worst-case social cost with $\gamma$. Note that $\bar{\theta} > 0.2$ for $\gamma < 0.3466$ and $\bar{\theta} < 0.2$ for $\gamma > 0.3466$.

Fig. 3 Variation of optimal social cost and the worst possible social cost at equilibrium with $\gamma$ for complete information game


Thus, when $\gamma < 0.3466$ the worst-case social cost is 0 by Theorem 3. On the other hand, when $\gamma > 0.3466$ the worst-case social cost is exactly the same as the optimal social cost.


9.2 Partial Information Game 

In this section, we numerically evaluate the social cost when only partial information is available to each player. We consider $x_0 = 0.2$.

We start this section by noting from Theorem 4 and Theorem 8 that the variation of the best possible social cost at an equilibrium is exactly the same as in the complete information game. Hence, we omit the study of the best-case scenario.

Figure 4 shows the variation of the optimal social cost and the worst-case social cost with $\gamma$. From (17) we obtain that $\hat{\theta} > 0.2$ for $\gamma < 0.1865$ and $\hat{\theta} < 0.2$ for $\gamma > 0.1865$. Hence, by Theorem 8 the worst-case social cost is 0 when $\gamma < 0.1865$. Note from (18) that $\frac{1}{\lambda}\int_0^{\rho} c(u)\,du = 0.2503$. Thus, by Theorem 8 the worst-case social cost is equal to the optimal social cost for $\gamma \in (0.1865, 0.2503)$, and it becomes 0 when $\gamma = 0.2503$ (since $\bar{\theta} = 0.5012 > x_0$ when $\gamma = 0.2503$). The worst-case social cost is again equal to the optimal social cost when $\gamma > 0.2503$.
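The regimes reported in Sections 9.1 and 9.2 can be summarized as a small lookup. This is only a sketch of the case distinctions as stated in the text for $x_0 = 0.2$: the breakpoints are copied from the text rather than recomputed, the function names are hypothetical, and the behaviour exactly at the breakpoints 0.1865 and 0.3466, which the text leaves unspecified, is an assumption. Each function returns "zero" when the equilibrium social cost is 0 and "optimal" when it coincides with the optimal social cost.

```python
import math

# Breakpoints in gamma as reported in the text (for x_0 = 0.2).
G_LOWER = 0.1865  # from (17), used with Theorems 4 and 8
G_INT = 0.2503    # value of the integral in (18)
G_UPPER = 0.3466  # used with Theorem 3


def best_case(gamma):
    """Best equilibrium; the text states it is the same in both games."""
    # gamma == G_LOWER is not covered by the text; "optimal" is assumed.
    return "zero" if gamma < G_LOWER else "optimal"


def worst_case_complete(gamma):
    """Worst equilibrium in the complete information game."""
    # gamma == G_UPPER is not covered by the text; "optimal" is assumed.
    return "zero" if gamma < G_UPPER else "optimal"


def worst_case_partial(gamma):
    """Worst equilibrium in the partial information game."""
    if gamma < G_LOWER or math.isclose(gamma, G_INT, abs_tol=1e-9):
        return "zero"
    return "optimal"


for g in (0.1, 0.2, 0.2503, 0.3, 0.4):
    print(g, best_case(g), worst_case_complete(g), worst_case_partial(g))
```

Laid out this way, the two games differ only in the worst case: for $\gamma$ strictly between 0.2503 and 0.3466 the worst equilibrium has zero social cost under complete information but attains the optimal social cost under partial information.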


Fig. 4 Variation of optimal social cost and the worst possible social cost at equilibrium with $\gamma$ for partial information game


10 Conclusions

We studied in this paper a congestion game in a fluid queueing network in which customers benefit from congestion, i.e., the cost per customer decreases with the congestion. We showed that this can lead to a large number of symmetric equilibria, all of which exhibit reverse threshold behavior: customers get in if and only if the number of queued customers exceeds the threshold. We computed perfect equilibria of this game and the social optimum. Further, we considered a model where the information provided to the players is limited to an indication of whether the state of the queue is above or below some threshold. It turned out that an appropriate limitation of the information obtained by the players can draw the outcome of the game towards the social optimum. Finally, we showed that one can use the equilibrium policies in the fluid queue to approximate equilibria for discrete queues, and we provided numerical examples.


References 

1. Altman E, Gaujal B, Hordijk A (2000) Admission Control in Stochastic Event Graphs. 
IEEE Automatic Control 45 (5):854-867 

2. Altman E, Jimenez T (2013) Admission Control to an M/M/1 Queue with Partial Infor¬ 
mation. Analytical and Stochastic Modeling Techniques and Applications. Lecture Notes 
in Computer Science 7984 pp. 12-21 

3. Altman E, Shimkin N (1998) Individual equilibrium and learning in processor sharing 
systems. Operations Research 46:776-784 

4. Anshelevich E, Dasgupta A, Kleinberg J, Tardos E, Wexler T, Roughgarden T (2004) The 
Price of Stability for Network Design with Fair Cost Allocation. Annual IEEE Symposium 
on Foundations of Computer Science 

5. Darroch JN, Seneta E (1965) On quasi-stationary distributions in absorbing discrete-time 
finite Markov chains. Journal of Applied Probability 

6 . Hordijk A, Spieksma F (1989) Constrained Admission Control to a Queueing System. 
Ann. Appl. Probability 21:409-431 

7. Hsiao MT, Lazar AA (1991) Optimal decentralized flow control of Markovian queueing 
networks with multiple controllers. Performance Evaluation 13:181-204 

8 . Korilis YA, Lazar A (1995) On the existence of equilibria in noncooperative optimal flow 
control. Journal of the ACM 42 (3):584-613 




Mean-field Game Approach to Admission Control of an M/M/oo Queue 


33 


9. Koutsoupias E, Papadimitriou CH (1999) Worst-case equilibria. In 16th Annual Sym¬ 
posium on Theoretical Aspects of Computer Science, pp. 404-413, Trier, Germany, 4-6 
March 1999. 

10. Naor P (1969) On the Regulation of Queueing Size by Levying Tolls. Econometrica 
37:15-24 

11. Schwartz A, Weiss A (1995) Large Deviations for Performance Analysis. Chapman &■ 
Hall, London 

12. Stidham S (1985) Optimal control of admission to a queueing system. IEEE Transactions 
on Automatic Control 30:705-713 

13. Stidham S, Rajagopal S, Kulkarni VG (1995) Optimal flow control of a stochastic fluid- 
flow system. IEEE Journal on Selected Areas in Communications 13:1219-1228 

14. Stidham S, Weber RR (1989) Monotonic and Insensitive Optimal Policies for Control 
of Queues with Undiscounted Costs. Operations Research 37:611-625 

15. Tembine H, Le Boudec JY, El Azouzi R, Altman E (2009) From mean field interaction 
to evolutionary game dynamics. WiOpt 2009 

16. Wiecek P, Altman E, Ghosh A (2014) Mean-field Game Approach to Admission Control 
of an M/M/oo Queue with Decreasing Congestion Cost. 7th International Conference on 
NETwork Games COntrol and Optimization (NETGCOOP 2014), Oct 2014, Trento, Italy. 
2014 

17. Yechiali U (1971) On Optimal Balking Rules and Toll Charges in a GI\M\l Queueing 
Process. Operations Research 19:349-370 



